Re: [ptp-dev] SDM Debugger/Eclipse client protocol

Roland,

I've worked with some new hardware in the past that only supported C, and until recently CUDA (used for programming GPUs) hasn't supported C++.

However, the point is that most of the proxy/debug code is already in C, so adopting something that does not provide native C support or requires linking with C++ is not an option as far as I'm concerned. If there is C support for protocol buffers then that is a different matter. I'm not opposed to using it. In fact, I suggested it to Dave in the first place. 

I'm not convinced that protocol buffers will be any faster or more compact than our representation. The two are very similar, except that we don't have a schema. Maybe there's an extra byte or two here and there.

What are you referring to when you talk about runtime versus compile time? We need to support runtime caching of strings because we don't know what they are going to be; however, we want to be efficient and avoid sending the same string multiple times. A protocol buffer implementation would need to do the same. What do you mean by dynamic attributes?

I also have reservations about requiring the schemas to be compiled, but it's not a show stopper. It is an extra headache for anyone working on the proxy/debugger code since it requires protocol buffers to be installed, and an extra step in the build process (with more things to go wrong). I've had enough years of experience working with Sun's RPC to know what that is like.

The main issue I see is the significant amount of work that would be required to re-implement everything using protocol buffers. Dave is doing the development, so unless someone else is going to step up and volunteer to do the work, I think he ultimately needs to make the call as to the best way to go.

Greg

On Mar 30, 2010, at 7:21 PM, Roland Schulz wrote:

Hi,

Well, there are also C implementations: http://wiki.github.com/haberman/upb/ and http://code.google.com/p/protobuf-c/ (*)

But my point was not to push for any specific protocol or to say that Dave's proposal is not good, but to understand what is better about the proposed one compared to existing ones. It is not that easy to make a format both very compact and fast to parse, so I am still wondering whether the result will be better than using an existing one.

I would also like to understand the advantages and disadvantages of the differences. E.g., why have a runtime table and not a compile-time table? Is the speed difference in parsing not important for us (because, e.g., we will always be limited by connection speed)? Do we need dynamic attributes? Is it good or bad to have a schema (e.g. .proto for protocol buffers)?

Roland

(*) Off-topic: What platform doesn't support (any) C++? Even the 8-bit AVR has (some) C++ support.




On Tue, Mar 30, 2010 at 5:55 PM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:
The C++ requirement is a non-starter as far as I'm concerned. There are many systems (e.g. embedded or with specialized hardware) that don't have C++ compilers and I don't want to exclude these from PTP support. Both Open MPI and STCI use C for this reason.

Greg


On Mar 30, 2010, at 4:58 PM, Dave Wootton wrote:


Roland
For keys that are part of the core client/proxy/debugger protocol, we should not ever be passing the keys as strings. We should be passing the keys as their integer enumeration value using the varint data format. For keys which are not part of the core protocol, and for any other string values, we should be passing the string in the message body only one time.  The assumption is that as the sender (proxy in this case) recognizes a string, it checks a string table (array of character strings) and adds new unique strings to this table. If this is the first occurrence of the string in any message, then the string is included in the message body so that the receiver (client in this case) also has the same string. If this is the second or subsequent reference to a string, then the array index in the string table replaces the string value, where the index is an integer encoded in varint format.

The assumption is that the receiver will add strings in messages that it receives to a string table it maintains. Since the expectation is that messages are always read in the same order they were sent and no messages are discarded before being scanned for strings, the string tables on both sides of the connection should be identical. So we pay a penalty for the first use of a string, but subsequent usages should be cheap.
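
A minimal C sketch of the string table described above, for illustration only. The names `strtab`, `strtab_find`, `strtab_add` and `strtab_intern`, the fixed capacity and the linear search are all assumptions made for the sketch; they are not taken from the PTP sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define STRTAB_MAX 1024 /* hypothetical fixed capacity for the sketch */

/* Hypothetical string table: slot i holds the i-th unique string seen. */
typedef struct {
    char  *strings[STRTAB_MAX];
    size_t count;
} strtab;

/* Return the index of s, or -1 if it has not been seen yet. */
static long strtab_find(const strtab *t, const char *s)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->strings[i], s) == 0)
            return (long)i;
    return -1;
}

/* Append a copy of s; return its index, or -1 if full or out of memory. */
static long strtab_add(strtab *t, const char *s)
{
    assert(s != NULL);
    if (t->count >= STRTAB_MAX)
        return -1;
    size_t n = strlen(s) + 1;
    char *copy = (char *)malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, n);
    t->strings[t->count] = copy;
    return (long)t->count++;
}

/* Sender side. Sets *first_use to 1 when the string itself must go into
 * the message body, or to 0 when only the returned index has to be sent. */
static long strtab_intern(strtab *t, const char *s, int *first_use)
{
    long idx = strtab_find(t, s);
    *first_use = (idx < 0);
    return *first_use ? strtab_add(t, s) : idx;
}
```

The receiver would call `strtab_add` for every string that arrives in full and index into its own table for every reference. Because messages are processed in order, both tables assign the same index to the same string.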

Recompiling the proxies as C++ programs doesn't seem like it should be difficult. The only potential problem I can think of is that the C++ compiler's type checking might be tighter and require some code cleanup, which wouldn't be a bad thing.
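
As one example of the kind of cleanup this would involve: C converts `void *` to other pointer types implicitly, while C++ requires an explicit cast. The function below is a hypothetical illustration, not code from the proxies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Valid C, but rejected by a C++ compiler, because malloc returns void *
 * and C++ does not convert void * to char * implicitly:
 *
 *     char *buf = malloc(len + 1);
 */

/* Accepted by both C and C++ compilers: the conversion is explicit. */
static char *copy_string(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);
    char *buf = (char *)malloc(len + 1);
    if (buf != NULL)
        memcpy(buf, src, len + 1); /* copies the terminating NUL as well */
    return buf;
}
```

Other common differences include identifiers that are keywords in C++ (`new`, `class`) and stricter rules for converting between enums and integers.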

Taking advantage of the classes generated by the protocol buffers implementation might be a little messy. Unless these are static classes, code somewhere has to create an object for each of these classes and save the pointer to the object so C code can call them. I think most of the accesses to these classes would be from the proxy support libraries, but there is code at least in the PE proxy which is creating a message then appending parameters to the message. So changes would be required to handle the protocol buffers implementation.

The other reservation I had with the protocol buffers implementation is that this implementation requires defining a message handling class for each unique message format in a special grammar, then running a preprocessor tool to generate the corresponding C++ classes. I'm not sure where this would fit in the whole PTP build process. The classes could be built one time offline, by hand, but then someone has to remember to regenerate the message handling classes if the message format changes.

I think Greg had some other reservations about the use of C++, but I don't remember what they were.
Dave


From: Roland Schulz <roland@xxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/30/2010 12:35 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx





Dave, Greg,

On Tue, Mar 30, 2010 at 10:05 AM, Dave Wootton <dwootton@xxxxxxxxxx> wrote:

Roland

I looked at these protocols again. The primary objective I had was that I wanted to transmit as little data over the connection as I could, since with large systems and large numbers of notifications or with applications that generate lots of stdio output the connection to the Eclipse client could be overloaded. I'd like for PTP to be usable with low bandwidth connections (cable/DSL) if possible. Part of this means transmitting in binary format where possible. The main problem I saw with the protocols in this table was that they include metadata that defines the structure of the data. In some cases, such as protocol buffers, the amount of metadata appears to be small.

I think Protocol Buffers is actually more compact than what you are currently proposing. As far as I understand, attributes would still be sent as strings. Protocol Buffers would allow us to send all the attributes in binary, and we would not have to send keys as strings.

In Protocol Buffers the key is only a 5-bit ID. While you can save those 5 bits when you always know the exact message format, doing so prevents you from having optional or repeated fields. As it is, in your suggestion the key is a string and thus very much larger than 5 bits.
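
For reference, a minimal sketch of how a Protocol Buffers field key is laid out on the wire, as documented in the Protocol Buffers encoding guide: the key is a varint holding the field number shifted left by three bits, ORed with a 3-bit wire type. Field numbers 1 through 15 therefore fit in a one-byte key. The function names `put_varint` and `put_key` are made up for this sketch and are not part of any protobuf library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two of the wire types defined by the Protocol Buffers encoding. */
enum { WT_VARINT = 0, WT_LEN = 2 };

/* Write v as a base-128 varint; returns the number of bytes written. */
static size_t put_varint(uint8_t *out, uint64_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)(v | 0x80); /* low 7 bits plus continuation bit */
        v >>= 7;
    }
    out[n++] = (uint8_t)v;
    return n;
}

/* Write the key that precedes a field's value. */
static size_t put_key(uint8_t *out, uint32_t field_number, unsigned wire_type)
{
    assert(field_number >= 1 && wire_type <= 7);
    return put_varint(out, ((uint64_t)field_number << 3) | wire_type);
}
```

With this layout, field 1 with a varint value has the key byte 0x08, and field 16 is the first field number that needs a two-byte key.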


The other problem with some of these protocols, including protocol buffers, is that they do not support a C programming API. The PE and LoadLeveler proxies are written in C and are pretty large. These could call a C++ library, but I think calling methods in C++ classes from C code gets a bit cumbersome. Dealing with the accessor methods generated by the protocol buffers tools might be difficult.


Why not just compile PE and LL as C++? Usually very few fixes are needed to compile C code as C++. For another (much larger) project we did this rather quickly.
 
If this protocol were meant as a data exchange format for use by a wider set of tools, I think using an existing protocol would be a higher priority. In this case, where the only consumers are the PTP client and a set of proxies, I think the scaling requirements take higher priority.

I agree. I am just wondering whether the scaling would not be better with Protocol Buffers.

Roland


 
Dave

From: Roland Schulz <roland@xxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/29/2010 06:33 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx






Hi,

If we are changing the protocol format anyway, wouldn't it make sense to use an existing library rather than reinventing the wheel?

The requirements I see:
Supports Java and C/C++  
Compact
Fast to parse

There are many libraries which do that (http://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats). One is http://code.google.com/apis/protocolbuffers/

The only potential problem I see is that the interface is C++ and not C. But there is really no platform anymore without a C++ compiler, so I don't see why we wouldn't want to compile the proxy/sdm with C++.

Roland

On Mon, Mar 29, 2010 at 4:17 PM, Dave Wootton <dwootton@xxxxxxxxxx> wrote:

Greg

This might work, but I think it gets a bit complicated in the Eclipse client code if we stop assuming that the message arguments and event arguments are of type String.

I've already modified the args array that is built on the proxy side to contain the existing array of char * and a character array parallel to that defining the type of the argument as either string or enum. To do this right, the args array should probably be an array of a union between char *, int, enum and bitset *. I implemented the 'type' array as a second array instead of defining a struct containing a type flag and a 4 byte int/pointer since I was concerned the compiler would pad this and make it an 8 byte element rather than a 5 byte element, meaning three wasted bytes per argument.
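
The padding concern can be made concrete with a small C sketch. The type names here (`arg_value`, `tagged_arg`, `arg_list`) are hypothetical and only illustrate the two layouts being compared; `void *` stands in for `bitset *`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument value: a union of the candidate types. */
typedef union {
    char *str;
    int   num;
    void *bits; /* stand-in for bitset * */
} arg_value;

/* Layout 1: type flag and value in one struct. The compiler inserts
 * padding after 'type' so that 'value' is properly aligned. */
typedef struct {
    unsigned char type;
    arg_value     value;
} tagged_arg;

/* Layout 2: two parallel arrays, so there is no per-element padding. */
typedef struct {
    arg_value     *values;
    unsigned char *types;
    size_t         count;
} arg_list;

/* Bytes of padding avoided by layout 2 for nargs arguments. */
static size_t padding_saved(size_t nargs)
{
    assert(sizeof(tagged_arg) >= sizeof(arg_value) + 1);
    return nargs * (sizeof(tagged_arg) - sizeof(arg_value) - 1);
}
```

On a typical 32-bit ABI this gives the three wasted bytes per argument mentioned above (8 versus 5); on a typical 64-bit ABI the difference grows to seven bytes (16 versus 9).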


proxy_serialize_msg could be changed to prefix each argument with a type byte. When the client retrieves the message, the code handling that in the ProxyPacket class must recognize the argument type byte and decode the following argument according to type. The rest of the event handling code in the Eclipse client seems to be oriented around treating the arguments as a generic array of String. If we were to change this, I think we end up defining specific constructors for a bunch of events that accept differing sets of arguments based on event type and adding a bunch of more specific event encoding logic.


proxy_deserialize_msg currently pulls each argument out of the message buffer and puts it into the args array as a string. That could be changed to construct the args array with each argument being stored as the proper type.
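
A minimal sketch of what a type byte in front of each argument could look like. The type codes, the one-byte string length and the fixed-width big-endian integer are assumptions chosen to keep the example short; the actual protocol discussed in this thread uses varint encoding, and none of these function names exist in the PTP sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type codes; the real protocol would define its own. */
enum { ARG_STRING = 1, ARG_INT = 2 };

/* Write: type byte, one-byte length, then the string bytes (no NUL). */
static size_t put_string_arg(uint8_t *out, const char *s)
{
    size_t len = strlen(s);
    assert(len <= 255);
    out[0] = ARG_STRING;
    out[1] = (uint8_t)len;
    memcpy(out + 2, s, len);
    return 2 + len;
}

/* Write: type byte, then a 32-bit value in big-endian byte order. */
static size_t put_int_arg(uint8_t *out, uint32_t v)
{
    out[0] = ARG_INT;
    out[1] = (uint8_t)(v >> 24);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 8);
    out[4] = (uint8_t)v;
    return 5;
}

/* Inspect the type byte and report how many bytes the whole argument
 * occupies, or 0 for an unknown type. */
static size_t arg_size(const uint8_t *in)
{
    switch (in[0]) {
    case ARG_STRING: return 2 + (size_t)in[1];
    case ARG_INT:    return 5;
    default:         return 0;
    }
}

/* Decode an integer argument written by put_int_arg. */
static uint32_t get_int_arg(const uint8_t *in)
{
    assert(in[0] == ARG_INT);
    return ((uint32_t)in[1] << 24) | ((uint32_t)in[2] << 16) |
           ((uint32_t)in[3] << 8)  |  (uint32_t)in[4];
}
```

A deserializer built this way can walk the buffer with `arg_size` and store each argument as its proper type instead of converting everything to a string.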

There are some functions in the SDM utils/event.c source file that do some parameter validation, get the next string argument out of the message array and convert it to the proper type (dbg_str_to_*). I think those get changed to keep the validation, but the conversion becomes a copy or maybe a different conversion.


The Eclipse client seems to be implemented with the assumption that the array of arguments to a command is an array of String, and that the set of parameters associated with an event is also an array of String. For commands sent from the client to the proxy, this probably isn't a problem since the proxy command handler functions currently assume they get an array of char * as a parameter.
The debug commands issued by the client are defined such that they consist of an array of String arguments or a bitset passed as a String. I think this means that all debugger commands sent by the client to the debugger are assumed to be sent as strings and it's up to the handler in the debugger to understand what the real type is and convert accordingly.

Dave

From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/29/2010 12:53 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx







Hi Dave,

I agree that it would be nice if we could be more intelligent about types rather than sending everything as strings. What do you think about adding a byte to each argument to indicate a data type? We currently have key/val, string, and int, but we could also add other types where it would make sense for efficiency.

Other than the corresponding routines in org.eclipse.ptp.proxy.protocol, I can't think of anywhere else in the debugger that would be impacted.

Cheers,
Greg



On Mar 29, 2010, at 10:30 AM, Dave Wootton wrote:



Greg


I looked at the SDM code and think I have additional changes on the proxy side of the connection as follows:

1) sdm_message_send serializes msg->aggregate, msg->src and msg->dest by converting them to ASCII strings. I think I need to convert the aggregate value to varint and the src and dest to an array of byte data. The body of the message has already been converted to the new binary protocol by proxy_serialize_msg.

2) The aggregate, src and dest need to be converted back to int and bitset in sdm_message_progress. The body of the message gets converted back to message header and args array form in proxy_deserialize_msg.
3) In proxy_deserialize_msg, it looks like each argument gets added to the args array as a string value, where if the string represents an enumeration, the value is reconstructed as key=value.

4) DbgDeserializeEvent looks like it is ok as-is. Converting the message from binary format to the existing message header and array of string arguments in proxy_deserialize_msg, then parsing that header and array of strings into the proper internal variables in DbgDeserializeEvent, seems a little inefficient in terms of CPU time. However, if proxy_deserialize_msg were to do anything more intelligent, then I think each argument in the binary message format needs to carry a type specification so it can be properly decoded. There are probably a number of other changes elsewhere in the code if we change the internal message structures to deserialize the message more intelligently.
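
For items 1 and 2, a minimal sketch of encoding and decoding an integer such as the aggregate value as a varint. It assumes the common base-128 scheme (7 data bits per byte, low-order group first, high bit set when more bytes follow) and a 32-bit value; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v as a base-128 varint. Returns the number of bytes written,
 * which is at most 5 for a 32-bit value. */
static size_t varint_encode(uint32_t v, uint8_t *out)
{
    assert(out != NULL);
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (uint8_t)((v & 0x7F) | 0x80);
        v >>= 7;
    }
    out[n++] = (uint8_t)v;
    return n;
}

/* Decode a varint from in[0..len). Returns the number of bytes consumed,
 * or 0 if the input is truncated or longer than 5 bytes. */
static size_t varint_decode(const uint8_t *in, size_t len, uint32_t *v)
{
    uint32_t result = 0;
    for (size_t i = 0; i < len && i < 5; i++) {
        result |= (uint32_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *v = result;
            return i + 1;
        }
    }
    return 0;
}
```

Small values, which are the common case for enumeration values and table indexes, take a single byte on the wire.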


These are the changes I can find by just reading the code. There might be more that will be found as part of actually changing the code.


Does this seem reasonable?


Dave
From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/24/2010 04:46 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx








Yes, the debugger protocol is in org.eclipse.ptp.proxy.protocol, and the SDM (org.eclipse.ptp.debug.sdm) uses both the proxy and utils libraries. For the C side, take a look in src/client/client_cmds.c and src/utils/event.c.

Greg

On Mar 24, 2010, at 10:27 AM, Dave Wootton wrote:



Greg

I realized that in my rework of the client/proxy protocol I didn't consider SDM debugger communication with the Eclipse client. Does the debugger use the same ProxyPacket class as the proxies use, and does the SDM debugger use the same org.eclipse.ptp.proxy and org.eclipse.ptp.utils libraries as the proxies use? Are there other places where I should look as part of implementing the binary proxy protocol changes?

Dave
_______________________________________________
ptp-dev mailing list

ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev
--
ORNL/UT Center for Molecular Biophysics
cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309