What does the protobuf text format look like?
Serialization Problem Overview
Google Protocol Buffers can be serialized not only in binary format but also as text, known as textproto. However, I can't easily find examples of such text; what does it look like?
Expected answer: an example covering all features allowed by the protobuf IDL/proto file, including a sample protobuf packet in textual form.
Serialization Solutions
Solution 1 - Serialization
I worked out an example myself:
test.proto
syntax = "proto2";

enum MyEnum
{
  Default = 0;
  Variant1 = 1;
  Variant100 = 100;
}
message Test {
  required string f1 = 1;
  required int64 f2 = 2;
  repeated uint64 fa = 3;
  repeated int32 fb = 4;
  repeated int32 fc = 5 [packed = true];
  repeated Pair pairs = 6;
  optional bytes bbbb = 7;
  extensions 100 to max;
}
message Pair {
  required string key = 1;
  optional string value = 2;
}
extend Test {
  optional bool gtt = 100;
  optional double gtg = 101;
  repeated MyEnum someEnum = 102;
}
example output:
f1: "dsfadsafsaf"
f2: 234
fa: 2342134
fa: 2342135
fa: 2342136
fb: -2342134
fb: -2342135
fb: -2342136
fc: 4
fc: 7
fc: -12
fc: 4
fc: 7
fc: -3
fc: 4
fc: 7
fc: 0
pairs {
  key: "sdfff"
  value: "q\"qq\\q\n"
}
pairs {
  key: "   sdfff2  \321\202\320\265\321\201\321\202 "
  value: "q\tqq<>q2&\001\377"
}
bbbb: "\000\001\002\377\376\375"
[gtt]: true
[gtg]: 20.0855369
[someEnum]: Variant1
the program:
#include <google/protobuf/text_format.h>
#include <stdio.h>
#include <string>
#include "test.pb.h"

int main() {
  Test t;

  // Scalar and repeated fields
  t.set_f1("dsfadsafsaf");
  t.set_f2(234);
  t.add_fa(2342134);
  t.add_fa(2342135);
  t.add_fa(2342136);
  t.add_fb(-2342134);
  t.add_fb(-2342135);
  t.add_fb(-2342136);
  t.add_fc(4);
  t.add_fc(7);
  t.add_fc(-12);
  t.add_fc(4);
  t.add_fc(7);
  t.add_fc(-3);
  t.add_fc(4);
  t.add_fc(7);
  t.add_fc(0);

  // bytes field: the length is passed explicitly because the data contains a NUL byte
  t.set_bbbb("\x00\x01\x02\xff\xfe\xfd", 6);

  // Nested messages
  Pair *p1 = t.add_pairs(), *p2 = t.add_pairs();
  p1->set_key("sdfff");
  p1->set_value("q\"qq\\q\n");
  p2->set_key("   sdfff2  тест ");
  // Note: \xff is not valid UTF-8 in a string field (see the error message below)
  p2->set_value("q\tqq<>q2&\x01\xff");

  // Extensions
  t.SetExtension(gtt, true);
  t.SetExtension(gtg, 20.0855369);
  t.AddExtension(someEnum, Variant1);

  // Print the message in text format
  std::string str;
  google::protobuf::TextFormat::PrintToString(t, &str);
  printf("%s", str.c_str());
  return 0;
}
Binary protobuf of this sample (for completeness):
00000000 0a 0b 64 73 66 61 64 73 61 66 73 61 66 10 ea 01 |..dsfadsafsaf...|
00000010 18 f6 f9 8e 01 18 f7 f9 8e 01 18 f8 f9 8e 01 20 |............... |
00000020 8a 86 f1 fe ff ff ff ff ff 01 20 89 86 f1 fe ff |.......... .....|
00000030 ff ff ff ff 01 20 88 86 f1 fe ff ff ff ff ff 01 |..... ..........|
00000040 2a 1b 04 07 f4 ff ff ff ff ff ff ff ff 01 04 07 |*...............|
00000050 fd ff ff ff ff ff ff ff ff 01 04 07 00 32 10 0a |.............2..|
00000060 05 73 64 66 66 66 12 07 71 22 71 71 5c 71 0a 32 |.sdfff..q"qq\q.2|
00000070 23 0a 14 20 20 20 73 64 66 66 66 32 20 20 d1 82 |#..   sdfff2  ..|
00000080 d0 b5 d1 81 d1 82 20 12 0b 71 09 71 71 3c 3e 71 |...... ..q.qq<>q|
00000090 32 26 01 ff 3a 06 00 01 02 ff fe fd a0 06 01 a9 |2&..:...........|
000000a0 06 ea 19 0c bf e5 15 34 40 b0 06 01 |.......4@...|
000000ac
Note that the sample is not completely valid: the string field value of the second pair contains the byte \377, which is not valid UTF-8, so the library reports: libprotobuf ERROR google/protobuf/wire_format.cc:1059] Encountered string containing invalid UTF-8 data while parsing protocol buffer. Strings must contain only UTF-8; use the 'bytes' type for raw bytes.
Note that the protoc tool can also decode messages to text, both with the proto file and without:
$ protoc --decode=Test test.proto < test.bin
[libprotobuf ERROR google/protobuf/wire_format.cc:1091] String field 'value' contains invalid UTF-8 data when parsing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
f1: "dsfadsafsaf"
f2: 234
fa: 2342134
fa: 2342135
fa: 2342136
fb: -2342134
fb: -2342135
fb: -2342136
fc: 4
fc: 7
fc: -12
fc: 4
fc: 7
fc: -3
fc: 4
fc: 7
fc: 0
pairs {
  key: "sdfff"
  value: "q\"qq\\q\n"
}
pairs {
  key: "   sdfff2  \321\202\320\265\321\201\321\202 "
  value: "q\tqq<>q2&\001\377"
}
bbbb: "\000\001\002\377\376\375"
[gtt]: true
[gtg]: 20.0855369
[someEnum]: Variant1
$ protoc --decode_raw < test.bin
1: "dsfadsafsaf"
2: 234
3: 2342134
3: 2342135
3: 2342136
4: 18446744073707209482
4: 18446744073707209481
4: 18446744073707209480
5: "\004\007\364\377\377\377\377\377\377\377\377\001\004\007\375\377\377\377\377\377\377\377\377\001\004\007\000"
6 {
  1: "sdfff"
  2: "q\"qq\\q\n"
}
6 {
  1: "   sdfff2  \321\202\320\265\321\201\321\202 "
  2: "q\tqq<>q2&\001\377"
}
7: "\000\001\002\377\376\375"
100: 1
101: 0x403415e5bf0c19ea
102: 1
Solution 2 - Serialization
A simplified example, with output from protoc.exe version 3.0.0 on Windows 7 with Cygwin
Demo message
$ cat demo.proto
syntax = "proto3"; package demo; message demo { repeated int32 n=1; }
Create binary protobuf data
$ echo n : [1,2,3] | protoc --encode=demo.demo demo.proto > demo.bin
Dumping proto data as text
$ protoc --decode=demo.demo demo.proto < demo.bin
n: 1
n: 2
n: 3
You can dump the data even if you don't have the proto definition; the packed repeated field then shows up as a raw byte string
$ protoc --decode_raw < demo.bin
1: "\001\002\003"
Solution 3 - Serialization
An example from an open-source repo: https://github.com/google/nvidia_libs_test/blob/master/cudnn_benchmarks.textproto
convolution_benchmark {
  label: "NHWC_128x20x20x56x160"
  input {
    dimension: [128, 56, 20, 20]
    data_type: DATA_HALF
    format: TENSOR_NHWC
  }
}
More examples across GitHub:
https://github.com/search?q=extension%3Atextproto
https://github.com/search?q=extension%3Apbtxt