The C++ requirement is a non-starter as far as I'm concerned. There are many systems (e.g. embedded or with specialized hardware) that don't have C++ compilers, and I don't want to exclude these from PTP support. Open MPI and STCI both use C for this reason.
Greg
On Mar 30, 2010, at 4:58 PM, Dave Wootton wrote:
Roland
For keys that are part of the core client/proxy/debugger
protocol, we should not ever be passing the keys as strings. We should
be passing the keys as their integer enumeration value using the varint
data format. For keys which are not part of the core protocol, and for
any other string values, we should be passing the string in the message
body only one time. The assumption is that as the sender (proxy in
this case) recognizes a string, it checks a string table (array of character
strings) and adds new unique strings to this table. If this is the first
occurrence of the string in any message, then the string is included in
the message body so that the receiver (client in this case) also has the
same string. If this is the second or subsequent reference to a string,
then the array index in the string table replaces the string value, where
the index is an integer encoded in varint format.
The assumption is that the receiver
will add strings in messages that it receives to a string table it maintains.
Since the expectation is that messages are always read in the same order
they were sent and no messages are discarded before being scanned for strings,
the string tables on both sides of the connection should be identical.
So we pay a penalty for the first use of a string, but subsequent usages
should be cheap.
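A minimal sketch of the sender-side string table described above, in C. The names (strtab, strtab_intern, strtab_free) and the linear search are assumptions made for illustration; they are not existing PTP code. The receiver could run the same routine on every full string it reads, so that both tables assign the same index to the same string as long as messages are processed in order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string table; names and layout are illustrative only. */
struct strtab {
    char **items;   /* owned copies of each unique string, in first-seen order */
    size_t count;   /* number of strings stored */
    size_t cap;     /* allocated slots in items */
};

/*
 * Look up s and append it if it is new. Returns the table index, or -1 on
 * allocation failure. *is_new is set to 1 when the string was just added,
 * meaning the sender must put the full string in the message body; when it
 * is 0, only the returned index needs to be sent.
 */
long strtab_intern(struct strtab *t, const char *s, int *is_new)
{
    assert(t != NULL && s != NULL && is_new != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->items[i], s) == 0) {
            *is_new = 0;
            return (long)i;
        }
    }
    if (t->count == t->cap) {
        size_t ncap = t->cap ? t->cap * 2 : 16;
        char **p = realloc(t->items, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        t->items = p;
        t->cap = ncap;
    }
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, len);
    t->items[t->count] = copy;
    *is_new = 1;
    return (long)t->count++;
}

/* Release every stored string and reset the table to empty. */
void strtab_free(struct strtab *t)
{
    for (size_t i = 0; i < t->count; i++)
        free(t->items[i]);
    free(t->items);
    t->items = NULL;
    t->count = t->cap = 0;
}
```

A linear scan keeps the sketch short; a table with many distinct strings would probably want a hash lookup on the sender side, while the receiver only ever needs index-to-string access.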
Recompiling the proxies as C++ programs
doesn't seem like it should be difficult. The only potential problem I
can think of is that the C++ compiler's type checking might be tighter
and require some code cleanup, which wouldn't
be a bad thing.
Taking advantage of the classes generated
by the protocol buffers implementation might be a little messy. Unless
these are static classes, code somewhere has to create an object for each
of these classes and save the pointer to the object so C code can call
them. I think most of the accesses to these classes would be from the proxy
support libraries, but there is code at least in the PE proxy which is
creating a message then appending parameters to the message. So changes
would be required to handle the protocol buffers implementation.
The other reservation I had with the
protocol buffers implementation is that this implementation requires defining
a message handling class for each unique message format in a special grammar,
then running a preprocessor tool to generate the corresponding C++ classes.
I'm not sure where this would fit in the whole PTP build process. The classes
could be built one time offline, by hand, but then someone has to remember
to regenerate the message handling classes if the message format changes.
I think Greg had some other reservations
about the use of C++, but I don't remember what they were.
Dave
Dave, Greg,
On Tue, Mar 30, 2010 at 10:05 AM, Dave Wootton <dwootton@xxxxxxxxxx>
wrote:
Roland
I looked at these protocols again. The primary objective I had was that
I wanted to transmit as little data over the connection as I could, since
with large systems and large numbers of notifications or with applications
that generate lots of stdio output the connection to the Eclipse client
could be overloaded. I'd like for PTP to be usable with low bandwidth connections
(cable/DSL) if possible. Part of this means transmitting in binary format
where possible. The main problem I saw with the protocols in this table
was that they include metadata that defines the structure of the data.
In some cases, such as protocol buffers, the amount of metadata appears
to be small.
I think Protocol Buffers is actually more compact than
what you are currently proposing. As far as I understand, attributes
would still be sent as strings. Protocol Buffers would allow us to send
all the attributes in binary, and we would not have to send keys as strings.
In Protocol Buffers the key is only a 5-bit ID. While you
can save those 5 bits when you always know the exact message format, doing so
prevents you from having optional or repeated fields. In
your suggestion the key is a string and thus much larger than
5 bits.
The other problem with some of these protocols, including protocol buffers,
is that they do not support a C programming API. The PE and LoadLeveler
proxies are written in C and are pretty large. These could call a C++ library,
but I think calling methods in C++ classes from C code gets a bit cumbersome.
Dealing with the accessor methods generated by the protocol buffers tools
might be difficult.
Why not just compile PE and LL as C++? Usually very
few fixes are needed to compile C code as C++. For another (much larger)
project we did this rather quickly.
If this protocol were intended to be
a data exchange format for use by a wider set of tools, I think
an existing protocol would be higher priority. In this case, where the
only consumers are the PTP client and a set of proxies, I think the scaling
requirements take higher priority.
I agree. Just wondering whether the scaling would not
be better with Protocol Buffers.
Roland
Dave
Hi,
If we are changing the protocol format anyway, wouldn't it make sense to use an
existing library rather than reinventing the wheel?
The requirements I see:
Supports Java and C/C++
Compact
Fast to parse
There are many libraries which do that (http://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats).
One is
http://code.google.com/apis/protocolbuffers/
The only potential problem I see is that the interface is C++ and not C. But
there is really no platform anymore without a C++ compiler, so I don't see
why we wouldn't want to compile the proxy/sdm with C++.
Roland
On Mon, Mar 29, 2010 at 4:17 PM, Dave Wootton <dwootton@xxxxxxxxxx>
wrote:
Greg
This might work, but I think it gets a bit complicated in the Eclipse
client code if we no longer assume the message arguments and event arguments
are of type String.
I've already modified the args array that is built on the proxy side to
contain the existing array of char * and a character array parallel to
that defining the type of the argument as either string or enum.
To do this right, the args array should probably be an array of a union
between char *, int, enum and bitset *. I implemented the 'type' array
as a second array instead of defining a struct containing type flag and
a 4 byte int/pointer since I was concerned the compiler would pad this
and make it an 8 byte element rather than a 5 byte element, meaning three
wasted bytes per argument.
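The padding concern can be checked directly with sizeof. The sketch below is illustrative only: struct tagged_arg, struct arg_arrays and the helper functions are hypothetical names, and uint32_t stands in for the "4 byte int/pointer" mentioned above (on a 64-bit build a real pointer would be 8 bytes and the numbers change).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Option A: one struct per argument. The compiler is free to insert padding
 * after 'type' so that 'value' is aligned; on most mainstream ABIs that
 * makes the struct 8 bytes rather than 5. */
struct tagged_arg {
    unsigned char type;   /* e.g. string or enum */
    uint32_t value;       /* stand-in for a 4-byte int or pointer */
};

/* Option B: two parallel arrays, as described in the message above.
 * Each argument costs exactly 1 + 4 bytes, with no padding between elements. */
struct arg_arrays {
    unsigned char *types;
    uint32_t *values;
    size_t count;
};

/* Storage needed for n arguments under each layout. */
size_t bytes_as_structs(size_t n)
{
    return n * sizeof(struct tagged_arg);
}

size_t bytes_as_parallel(size_t n)
{
    return n * (sizeof(unsigned char) + sizeof(uint32_t));
}

/* Padding the struct layout adds per argument on the current platform. */
size_t wasted_bytes_per_arg(void)
{
    size_t payload = sizeof(unsigned char) + sizeof(uint32_t);
    assert(sizeof(struct tagged_arg) >= payload);
    return sizeof(struct tagged_arg) - payload;
}
```

Printing wasted_bytes_per_arg() on the target platform shows whether the second array actually saves anything there; the result depends on the compiler and ABI.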
proxy_serialize_msg could be changed to prefix each argument with a type
byte. When the client retrieves the message, the code handling that in
the ProxyPacket class must recognize the argument type byte and decode
the following argument according to type. The rest of the event handling
code in the Eclipse client seems to be oriented around treating the arguments
as a generic array of String. If we were to change this, I think we end
up defining specific constructors for a bunch of events that accept differing
sets of arguments based on event type and adding a bunch of more specific
event encoding logic.
proxy_deserialize_msg currently pulls each argument out of the message
buffer and puts it into the args array as a string. That could be changed
to construct the args array with each argument being stored as the proper
type.
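As a rough illustration of prefixing each argument with a type byte, here is a sketch in C. Everything in it is an assumption made for the example: the tag values, the function names (put_int_arg, put_str_arg, get_arg), the one-byte string length, and the fixed 4-byte big-endian integer, which is used only to keep the sketch short where the thread proposes varint. It is not the actual proxy_serialize_msg or proxy_deserialize_msg code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum arg_type { ARG_STRING = 1, ARG_INT = 2 };   /* hypothetical tag values */

/* A decoded argument stored as its proper type rather than as a string. */
struct arg {
    enum arg_type type;
    union {
        uint32_t i;
        struct { const unsigned char *ptr; size_t len; } s;  /* points into buffer */
    } u;
};

/* Write tag byte + 4-byte big-endian int. Returns bytes written, 0 if no room. */
size_t put_int_arg(unsigned char *buf, size_t cap, uint32_t v)
{
    assert(buf != NULL);
    if (cap < 5)
        return 0;
    buf[0] = ARG_INT;
    buf[1] = (unsigned char)(v >> 24);
    buf[2] = (unsigned char)(v >> 16);
    buf[3] = (unsigned char)(v >> 8);
    buf[4] = (unsigned char)v;
    return 5;
}

/* Write tag byte + 1-byte length + string bytes (no NUL terminator). */
size_t put_str_arg(unsigned char *buf, size_t cap, const char *s)
{
    assert(buf != NULL && s != NULL);
    size_t len = strlen(s);
    if (len > 255 || cap < len + 2)
        return 0;
    buf[0] = ARG_STRING;
    buf[1] = (unsigned char)len;
    memcpy(buf + 2, s, len);
    return len + 2;
}

/* Decode one argument according to its type byte.
 * Returns bytes consumed, or 0 if the input is truncated or the tag unknown. */
size_t get_arg(const unsigned char *buf, size_t len, struct arg *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 1)
        return 0;
    switch (buf[0]) {
    case ARG_INT:
        if (len < 5)
            return 0;
        out->type = ARG_INT;
        out->u.i = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
                   ((uint32_t)buf[3] << 8) | (uint32_t)buf[4];
        return 5;
    case ARG_STRING:
        if (len < 2 || len < 2 + (size_t)buf[1])
            return 0;
        out->type = ARG_STRING;
        out->u.s.ptr = buf + 2;
        out->u.s.len = buf[1];
        return 2 + (size_t)buf[1];
    default:
        return 0;
    }
}
```

The decoder dispatches purely on the leading byte, which is the same decision the ProxyPacket class would have to make on the Java side before handing typed values to the event code.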
There are some functions in the SDM utils/event.c source file that
do some parameter validation, get the next string argument out of the message
array and convert to the proper type (dbg_str_to_*). I think those get
changed to keep the validation, but the conversion becomes a copy or maybe
a different conversion.
The Eclipse client seems to be implemented with the assumption that the
array of arguments to a command is an array of String, and that the set
of parameters associated with an event is also an array of String. For
commands sent from the client to the proxy, this probably isn't a problem
since the proxy command handler functions currently assume they get an
array of char * as a parameter.
The debug commands issued by the client are defined such that they consist
of an array of String arguments or a bitset passed as a String. I think
this means that all debugger commands sent by the client to the debugger
are assumed to be sent as strings and it's up to the handler in the debugger
to understand what the real type is and convert accordingly.
Dave
Hi Dave,
I agree that it would be nice if we could be more intelligent about types
rather than sending everything as strings. What do you think about adding
a byte to each argument to indicate a data type? We currently have key/val,
string, and int, but we could also add other types where it would make
sense for efficiency.
Other than the corresponding routines in org.eclipse.ptp.proxy.protocol,
I can't think of anywhere else in the debugger that would be impacted.
Cheers,
Greg
On Mar 29, 2010, at 10:30 AM, Dave Wootton wrote:
Greg
I looked at the SDM code and think I have additional changes on the proxy
side of the connection as follows:
1) sdm_message_send serializes msg->aggregate, msg->src and msg->dest
by converting them to ASCII strings. I think I need to convert the aggregate
value to varint and the src and dest to an array of byte data. The body
of the message has already been converted to the new binary protocol by
proxy_serialize_msg.
2) The aggregate, src and dest need to be converted back to int and bitset
in sdm_message_progress. The body of the message gets converted back to
message header and args array form in proxy_deserialize_msg.
3) In proxy_deserialize_msg, it looks like each argument gets added to
the args array as a string value, where if the string represents an enumeration,
the value is reconstructed as key=value.
4) DbgDeserializeEvent looks like it is ok as-is. Converting the message
from binary format to the existing message header and array of string arguments
in proxy_deserialize_msg then parsing the message header and array of strings
format into the proper internal variables in DbgDeserializeEvent seems
a little inefficient in terms of CPU time. However, if proxy_deserialize_msg
were to do anything more intelligent, then I think each argument in the
binary message format needs to carry a type specification so it can be
properly decoded. There are probably a number of other changes elsewhere
in the code if we change the internal message structures to deserialize
the message more intelligently.
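For the varint conversion of the aggregate value in items 1 and 2, a minimal sketch follows. It assumes the base-128 encoding used by Protocol Buffers (7 data bits per byte, low-order group first, high bit set on every byte except the last); the thread does not spell out the exact varint format, and the function names varint_encode and varint_decode are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v into out, which must have room for at least 5 bytes.
 * Returns the number of bytes written (1 to 5 for a 32-bit value). */
size_t varint_encode(uint32_t v, unsigned char *out)
{
    size_t n = 0;
    assert(out != NULL);
    while (v >= 0x80) {
        out[n++] = (unsigned char)((v & 0x7F) | 0x80);  /* more bytes follow */
        v >>= 7;
    }
    out[n++] = (unsigned char)v;                        /* final byte, high bit clear */
    return n;
}

/* Decode a varint from in[0..len). Returns bytes consumed, or 0 if the
 * input ends before a terminating byte or runs past 5 bytes. Excess high
 * bits in a fifth byte are not validated in this sketch. */
size_t varint_decode(const unsigned char *in, size_t len, uint32_t *v)
{
    uint32_t result = 0;
    assert(in != NULL && v != NULL);
    for (size_t i = 0; i < len && i < 5; i++) {
        result |= (uint32_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *v = result;
            return i + 1;
        }
    }
    return 0;
}
```

With this encoding, values below 128 take a single byte, so small aggregate counts cost far less than their ASCII representation plus a delimiter.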
These are the changes I can find by just reading the code. There might
be more that will be found as part of actually changing the code.
Does this seem reasonable?
Dave
Yes, the debugger protocol is in org.eclipse.ptp.proxy.protocol, and the
SDM (org.eclipse.ptp.debug.sdm) uses both the proxy and utils libraries.
For the C side, take a look in src/client/client_cmds.c and src/utils/event.c.
Greg
On Mar 24, 2010, at 10:27 AM, Dave Wootton wrote:
Greg
I realized that in my rework of the client/proxy protocol I didn't consider
SDM debugger communication with the Eclipse client. Does the debugger use
the same ProxyPacket class as the proxies use, and does the SDM debugger
use the same org.eclipse.ptp.proxy and org.eclipse.ptp.utils libraries
as the proxies use? Are there other places where I should look as part
of implementing the binary proxy protocol changes?
Dave
_______________________________________________
ptp-dev mailing list
ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev
--
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309