Re: [ptp-dev] SDM Debugger/Eclipse client protocol

Just a word of caution:

My refactoring of the Process storage system relies on the mapping between
Processes and Process Attributes to be sparse, i.e. many processes for each
distinct attribute value.  If a particular resource manager violates this assumption
then this could impact scalability.

R^2
--
Randy M. Roberts
work: (505)665-4285

"There are only two industries that refer to their
customers as 'users.'"  -- Edward Tufte

On Mar 31, 2010, at 10:13 AM, Greg Watson wrote:

I'm pretty sure that most resource managers (including PBS) allow resources to be defined by the system administrator, so I don't think that statically defining them in a proto file is flexible enough.

Greg


On Mar 31, 2010, at 11:07 AM, Roland Schulz wrote:

Dave,

On Wed, Mar 31, 2010 at 9:59 AM, Dave Wootton <dwootton@xxxxxxxxxx> wrote:

My focus in reworking the protocol was to try to reduce the amount of data that is sent across the wire from the proxy to the client, and not so much emphasis on compute time spent serializing or de-serializing messages. The reason for trying to minimize the traffic across the network connection is that I think there are two possible sources of high message volume, stdio from the application and process state changes in an application with many parallel tasks.

A few months ago, I ran some profiling tests on the Eclipse client code. I modified the PE proxy to generate an arbitrarily large number of 'process start' messages to simulate the startup of a parallel application with a large number of tasks. I simulated as many as 20,000 tasks while profiling and found that in the current implementation, there was a large non-linear increase in processing time as the number of tasks increased, where the largest increases were in the event handling and UI code. The message parsing was not a significant part of the time. So at the moment, I'm not too concerned with the compute time of parsing messages.

Is the deserialization such a small part, that even after improving the event-handling and UI code it won't be a bottleneck?


There are two parts to the serialization of the messages. The first is the construction of the message, which is currently a message header defining the type of the message and the client command it is associated with, and an array of strings where each new argument in the message is appended to the array. The second part is the conversion of this message structure into the binary format I've described in the bug report and on the wiki, and sending that message across the network.

De-serializing the message also has two parts. The first is reading the message from the network and converting it back into a message header and array of strings. The second part of the de-serialization is at the point where the message is handled, where an event handler extracts the strings it needs from the message body (string array) and converts them to an appropriate type for subsequent use.

What I've done to date deals with the second part of serialization and the first part of de-serialization. The proxy and client event handling still deals with the message parameters as strings in an array.  There is a performance penalty due to the conversions to strings then to the proper data format that I think needs to be addressed as phase two of the protocol conversion, where the incoming data stream is converted directly to parameters of the appropriate type, skipping the conversion to and from strings.

I spent some time this morning looking at the protocol buffer code. I think I may have a somewhat more compact representation than protocol buffers since I don't see anything in the protocol buffers implementation that deals with the string table concept.
I looked at it some more for PBS and realized that all attributes are known at compile time (contrary to what I claimed in the last email - and called dynamic attributes). Thus at least for PBS for the keys/attribute-names no string table concept would be needed. All attribute names could be defined in the proto file at compile time.
For attribute values a string table would be good also when using protocol buffers.
 
The protocol buffers implementation tags each field in the message body with a field identifier, which is a single byte as long as there are less than 15 unique fields in the message body. In order to implement phase 2 of the serialization, I'm including a single byte to identify type for each parameter. The fact that I'm using varints to encode string lengths rather than using C-style strings may cost me a few more bytes. I'm not sure how protocol buffers is encoding strings.
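
For reference, a minimal Python sketch of the base-128 varint encoding mentioned above, together with the way protocol buffers builds a field tag and a length-prefixed string. The function names are illustrative and not taken from the PTP code. In the protocol buffers wire format the tag is (field number << 3) | wire type, itself written as a varint, so field numbers 1 through 15 fit in a single byte, and a string is written as a tag, a varint byte count and then the UTF-8 bytes.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles unsigned values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(buf, pos=0):
    """Return (value, next position) for the varint starting at buf[pos]."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def protobuf_tag(field_number, wire_type):
    """Protocol buffers field tag: (field_number << 3) | wire_type as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_string(text):
    """Length-prefixed string: varint byte count followed by the UTF-8 data."""
    data = text.encode("utf-8")
    return encode_varint(len(data)) + data
```

With this encoding, any string shorter than 128 bytes costs one length byte, the same overhead as the terminating NUL of a C-style string.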

The other thing I'm doing to reduce message size is that if a message is longer than some threshold then I Huffman encode the message using a constant encoding/decoding table. I think using Huffman encoding is a reasonable tradeoff between CPU time and compression efficiency. I had looked at using zlib, but there were some complications with using that and the Java Gzip classes buffering messages that I could not get around and which would cause problems with an interactive message protocol.
This will help mainly for the strings not in the string table, correct? I think the rest of the binary data probably will benefit very little from compression. 
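
A rough sketch of the idea of Huffman encoding with a constant table above a size threshold, in Python. Everything specific here is an assumption made for illustration: the threshold value, the sample text used to derive the table, the one-byte flag and the four-byte length prefix. The actual PTP table and framing are not described in this thread.

```python
import heapq
from collections import Counter

THRESHOLD = 32  # hypothetical cutoff: shorter messages are sent uncompressed
# Hypothetical sample of typical traffic, built into both proxy and client.
SAMPLE = b"jobId=1234 status=RUNNING queue=batch nodes=16 " * 20


def build_codes(sample):
    """Derive a fixed Huffman code covering all 256 byte values."""
    counts = Counter(sample)
    # Every byte gets a weight of at least 1 so any message can be encoded.
    # Heap entries are (weight, unique tie-breaker, symbols in the subtree).
    heap = [(counts[b] + 1, b, (b,)) for b in range(256)]
    heapq.heapify(heap)
    codes = {b: "" for b in range(256)}
    while len(heap) > 1:
        w0, t0, zeros = heapq.heappop(heap)
        w1, t1, ones = heapq.heappop(heap)
        for b in zeros:
            codes[b] = "0" + codes[b]
        for b in ones:
            codes[b] = "1" + codes[b]
        heapq.heappush(heap, (w0 + w1, min(t0, t1), zeros + ones))
    return codes


CODES = build_codes(SAMPLE)                      # byte value -> bit string
DECODE = {code: b for b, code in CODES.items()}  # bit string -> byte value


def pack(message):
    """Prefix with 0 (raw) or 1 (Huffman), compressing only long messages."""
    if len(message) <= THRESHOLD:
        return b"\x00" + message
    bits = "".join(CODES[b] for b in message)
    bits += "0" * (-len(bits) % 8)               # pad to a whole byte
    body = int(bits, 2).to_bytes(len(bits) // 8, "big")
    return b"\x01" + len(message).to_bytes(4, "big") + body


def unpack(packet):
    """Inverse of pack(); the stored length tells us where padding starts."""
    if packet[0] == 0:
        return packet[1:]
    count = int.from_bytes(packet[1:5], "big")
    body = packet[5:]
    bits = bin(int.from_bytes(body, "big"))[2:].zfill(len(body) * 8)
    out = bytearray()
    code = ""
    for bit in bits:
        code += bit
        if code in DECODE:                       # codes are prefix-free
            out.append(DECODE[code])
            code = ""
            if len(out) == count:
                break
    return bytes(out)
```

Because the table is fixed, nothing about it travels with the message, and each message can be decoded as soon as it arrives. The cost is that data unlike the sample (binary varints, for example) may not shrink at all, which matches the expectation that the benefit is mostly for literal strings.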

One thing I realized with this discussion is that I probably do not want to include stdio output in the string table, but to just pass that across as strings all the time. I think the probability of duplicated strings in application output is low and I run the risk of filling up all memory on the proxy and client side with string tables.

I think my de-serialization on the client side, at the binary format decoding level is pretty efficient. If I recognize a string, I simply append it to the string table I'm maintaining, which is a simple array (Vector). If I see a string table index, then I use that index as an array subscript to access the proper element in the string table.

Serialization is not optimal, since each string requires a linear search of the string table array to see if the new string is a unique string. This could be made more efficient by using a hash or tree structure, but then I need to come up with a reliable and fast index.
What do you mean by index?
If you use the hash the key would be the string and the value the index number. If the string is already in the hash you directly get the index number. If it is not in the hash you just take the next index number from a counter you keep of used index numbers. 
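
The hash approach described here can be sketched in a few lines of Python. The class and method names are made up for illustration. A dict gives the sender an average constant-time lookup, while a list kept in insertion order mirrors the array the receiver maintains.

```python
class StringIndex:
    """Sender-side string table: a dict for lookup plus a list for order."""

    def __init__(self):
        self._index_of = {}  # string -> index, average O(1) lookup
        self.strings = []    # index -> string, same order as the receiver's table

    def intern(self, text):
        """Return (index, is_new); a new string gets the next unused index."""
        index = self._index_of.get(text)
        if index is not None:
            return index, False
        index = len(self.strings)  # acts as the counter of used index numbers
        self._index_of[text] = index
        self.strings.append(text)
        return index, True
```

The index is simply the number of strings seen so far, so no separate scheme for generating indexes is needed as long as entries are never removed.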

There is some flexibility regarding optional message fields in the protocol buffers implementation that I don't have. However, since the PTP protocol is relatively simple with the only optional/repeated message parameters at the end of the message, and the message parameters are in a fixed format, I don't see an advantage with that.
If we would use protocol buffers, I would want to define all possible job/queue/etc PBS attributes as fields in their respective message. Thus there would be a lot of optional fields, because many possible attributes are not present in any specific job/queue/etc. Keeping the current format, with the key=value list of arguments at the end of the message, doesn't make any sense when using protocol buffers in my opinion. Generating this key=value list is already part of the serialization and thus should be done by the serialization library.

Thus there would be no advantage if the part defined in the proto file would be only the RM independent core protocol. But if there is a proto for each individual RM, far fewer hash/vector lookups would be needed because all attributes/message-fields would be defined at compile time.

So the question is: Should the attributes/message-fields be defined at compile time? Is the performance gain important? Do we want a schema? 

Given the above, plus the additional build complexity, I don't see a benefit to using protocol buffers.

If we don't want a schema (for all of the message, not just the core) then protocol buffers indeed seems like a poor match. Then a schema-less format like BSON would be a much better match.

Roland
 
 

Dave


From: Roland Schulz <roland@xxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/30/2010 10:49 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx





Greg, Dave,

On Tue, Mar 30, 2010 at 8:47 PM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:

However, the point is that most of the proxy/debug code is already in C, so adopting something that does not provide native C support or requires linking with C++ is not an option as far as I'm concerned.
Yes, linking with C++ is a pain. But recompiling in C++ should not be difficult. At least this is my experience. And as Dave pointed out, even for pure C code it has the advantage that one gets stricter type checking and thus might more easily catch bugs.
 
If there is C support for protocol buffers then that is a different matter. I'm not opposed to using it. In fact, I suggested it to Dave in the first place. 

I'm not convinced that protocol buffers will be any faster and/or more compact than our representation. They are very similar in representation, except that we don't have a schema. Maybe there's an extra byte or two here and there.
Regarding compact: I agree. It looks very similar. As I mentioned, originally it sounded quite different.
Regarding performance: I'm not sure. Different libraries have very different performance. See e.g. http://www.eishay.com/2009/03/more-on-benchmarking-java-serialization.html

And if the GUI receives enough messages the deserialization performance (the longest operation) might become important.


What are you referring to when you talk about runtime versus compile time?
What you call below "schemas to be compiled" versus index in runtime cached arrays of strings.
 
We need to support runtime caching of strings because we don't know what they are going to be, however we want to be efficient and avoid sending the same string multiple times. A protocol buffer implementation would need to do the same. What do you mean by dynamic attributes?

If one would use something like protocol buffers then the schema shouldn't only contain the core protocol (the RM independent part). Instead there should be a RM-independent schema which includes an extension point for the specific RM. The RM specific part of the schema would contain all the possible attributes for that RM. 
With dynamic attributes I meant additional attributes which are not in the RM documentation and thus are not known at compile time but might have to be transferred (because they are added e.g. by the administrator).

E.g. for PBS this is the case if one treats each element of the resource_list as an attribute. As the administrator of PBS is able to add resource attributes, those attributes are cluster dependent and not known at compile time. And because they are not known at compile time, and one uses compile time assigned key-IDs with a RM specific schema, one would need to handle those dynamic attributes separately, e.g. by having a cache of strings and an "others" field in the schema. But in this case the cache of strings would only be used for those dynamic attributes not known at compile time and not for all attributes.
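
To make the split concrete, here is a small Python sketch of keying attributes by a compile time ID where one exists and falling back to a runtime cache for the rest. The attribute names, the ID numbers and the tuple layout are all hypothetical. It only illustrates the decision of which key to use, not a wire format.

```python
# Hypothetical IDs that a RM specific schema would assign at compile time.
KNOWN_ATTRS = {"job_name": 1, "queue": 2, "walltime": 3}


class AttrEncoder:
    def __init__(self):
        # Runtime cache used only for names that are not in the schema,
        # e.g. resources added by the administrator.
        self.other_names = {}

    def encode(self, attrs):
        """Describe how each attribute would be keyed on the wire."""
        out = []
        for name, value in attrs.items():
            if name in KNOWN_ATTRS:
                out.append(("field", KNOWN_ATTRS[name], value))
            elif name in self.other_names:
                out.append(("other_ref", self.other_names[name], value))
            else:
                self.other_names[name] = len(self.other_names)
                out.append(("other_new", name, value))
        return out
```

Only the first appearance of an unknown name carries the name itself. Known names never touch the cache at all.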

I also have reservations about requiring the schemas to be compiled, but it's not a show stopper.
Yes. I think this is the main difference. And it certainly has disadvantages and advantages. The question is whether the advantages (e.g. deserialization performance) or disadvantages are more important.
 
It is an extra headache for anyone working on the proxy/debugger code since it requires protocol buffers to be installed,
I think we could just include it in the PTP code, since it is small and the license seems to allow it.

and an extra step in the build process (with more things to go wrong). I've had many years experience working with Sun's RPC to know what that is like.
Could be included in make/ant. 

The main issue I see is the significant amount of work that would be required to re-implement everything using protocol buffers. Dave is doing the development, so unless someone else is going to step up and volunteer to do the work, I think he ultimately needs to make the call as to the best way to go.
I agree. Because we need the protocol also for the PBS RM, we would also have to do some work. And we could help some. But nonetheless Dave should make the call.

Roland
 

Greg

On Mar 30, 2010, at 7:21 PM, Roland Schulz wrote:

Hi,

Well there are also C implementations: http://wiki.github.com/haberman/upb/ and http://code.google.com/p/protobuf-c/ (*)

But my point was not to push for any specific protocol or to say that Dave's proposal is not good, but to understand what is better about the proposed one compared to existing ones. It is not that easy to make a format both very compact and fast to parse, so I am still wondering whether the result will be better than using an existing one.

And also to understand the advantages and disadvantages of the differences. E.g. why have a runtime table and not a compile time table? Is the speed difference in parsing not important for us (because e.g. we will always be limited by connection speed)? Do we need dynamic attributes? Is it good/bad to have a schema (e.g. .proto for protocol buffers)?

Roland

(*). Off-topic: What platform doesn't support (any) C++? Even the 8bit AVR has (some) C++ support.




On Tue, Mar 30, 2010 at 5:55 PM, Greg Watson <g.watson@xxxxxxxxxxxx> wrote:
The C++ requirement is a non starter as far as I'm concerned. There are many systems (e.g. embedded or with specialized hardware) that don't have C++ compilers and I don't want to exclude these from PTP support. Open MPI and STCI both use C for this reason.

Greg


On Mar 30, 2010, at 4:58 PM, Dave Wootton wrote:


Roland

For keys that are part of the core client/proxy/debugger protocol, we should not ever be passing the keys as strings. We should be passing the keys as their integer enumeration value using the varint data format. For keys which are not part of the core protocol, and for any other string values, we should be passing the string in the message body only one time.  The assumption is that as the sender (proxy in this case) recognizes a string, it checks a string table (array of character strings) and adds new unique strings to this table. If this is the first occurrence of the string in any message, then the string is included in the message body so that the receiver (client in this case) also has the same string. If this is the second or subsequent reference to a string, then the array index in the string table replaces the string value, where the index is an integer encoded in varint format.


The assumption is that the receiver will add strings in messages that it receives to a string table it maintains. Since the expectation is that messages are always read in the same order they were sent and no messages are discarded before being scanned for strings, the string tables on both sides of the connection should be identical. So we pay a penalty for the first use of a string, but subsequent usages should be cheap.
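
The scheme described in the two paragraphs above can be sketched as a sender and receiver pair in Python. The one-byte tags that tell a literal string apart from a table index are an assumption made for this sketch, as are the class names. The thread does not say how the real format distinguishes the two cases.

```python
NEW_STRING = 0x01  # hypothetical tag: a literal string follows
TABLE_REF = 0x02   # hypothetical tag: a string table index follows


def _varint(n):
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def _read_varint(buf, pos):
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


class Sender:
    def __init__(self):
        self.table = {}  # string -> index

    def encode(self, strings):
        out = bytearray()
        for s in strings:
            if s in self.table:
                # Second or later use: send only the index.
                out += bytes([TABLE_REF]) + _varint(self.table[s])
            else:
                # First use: record it and send the string itself.
                self.table[s] = len(self.table)
                data = s.encode("utf-8")
                out += bytes([NEW_STRING]) + _varint(len(data)) + data
        return bytes(out)


class Receiver:
    def __init__(self):
        self.table = []  # index -> string, built in arrival order

    def decode(self, buf):
        pos, result = 0, []
        while pos < len(buf):
            tag = buf[pos]
            pos += 1
            if tag == NEW_STRING:
                length, pos = _read_varint(buf, pos)
                s = buf[pos:pos + length].decode("utf-8")
                pos += length
                self.table.append(s)
            elif tag == TABLE_REF:
                index, pos = _read_varint(buf, pos)
                s = self.table[index]
            else:
                raise ValueError("unknown tag %d" % tag)
            result.append(s)
        return result
```

In this sketch, sending the pair "jobState", "RUNNING" costs 19 bytes the first time and 4 bytes on every later message. The tables stay identical only because the receiver processes every message in the order it was sent.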


Recompiling the proxies as C++ programs doesn't seem like it should be difficult. The only potential problem I can think of is that the C++ compiler type checking might be tighter and require some code cleanup, which wouldn't be a bad thing.


Taking advantage of the classes generated by the protocol buffers implementation might be a little messy. Unless these are static classes, code somewhere has to create an object for each of these classes and save the pointer to the object so C code can call them. I think most of the accesses to these classes would be from the proxy support libraries, but there is code at least in the PE proxy which is creating a message then appending parameters to the message. So changes would be required to handle the protocol buffers implementation.


The other reservation I had with the protocol buffers implementation is that this implementation requires defining a message handling class for each unique message format in a special grammar, then running a preprocessor tool to generate the corresponding C++ classes. I'm not sure where this would fit in the whole PTP build process. The classes could be built one time offline, by hand, but then someone has to remember to regenerate the message handling classes if the message format changes.


I think Greg had some other reservations about the use of C++, but I don't remember what they were.

Dave


From: Roland Schulz <roland@xxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/30/2010 12:35 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx






Dave, Greg,

On Tue, Mar 30, 2010 at 10:05 AM, Dave Wootton <dwootton@xxxxxxxxxx> wrote:

Roland

I looked at these protocols again. The primary objective I had was that I wanted to transmit as little data over the connection as I could, since with large systems and large numbers of notifications or with applications that generate lots of stdio output the connection to the Eclipse client could be overloaded. I'd like for PTP to be usable with low bandwidth connections (cable/DSL) if possible. Part of this means transmitting in binary format where possible. The main problem I saw with the protocols in this table was that they include metadata that defines the structure of the data. In some cases, such as protocol buffers, the amount of metadata appears to be small.

I think Protocol Buffers is actually more compact than what you are currently proposing. As far as I understand, attributes would still be sent as strings. Protocol Buffers would allow us to send all the attributes in binary and we would not have to send keys as strings.

In Protocol Buffers the key is only a 5bit ID. While you can save those 5bits when you always know the exact message format, it prevents you from having optional fields or repeating fields. As it is in your suggestion, the key is a string and thus very much larger than the 5 bits.


The other problem with some of these protocols, including protocol buffers, is that they do not support a C programming API. The PE and LoadLeveler proxies are written in C and are pretty large. These could call a C++ library, but I think calling methods in C++ classes from C code gets a bit cumbersome. Dealing with the accessor methods generated by the protocol buffers tools might be difficult.


Why not just compile PE and LL in C++? Usually very few fixes are needed to compile C code in C++. For another (much larger) project we did this rather quickly.
 

If this protocol was intended to be a data exchange format intended for use by a wider set of tools, I think an existing protocol would be higher priority. In this case, where the only consumers are the PTP client and a set of proxies, I think the scaling requirements take higher priority.

I agree. Just wondering whether the scaling would not be better with Protocol Buffers.

Roland


 

Dave

From: Roland Schulz <roland@xxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/29/2010 06:33 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx







Hi,

If we change the protocol format anyhow, wouldn't it make sense to use an existing library rather than reinventing the wheel?

The requirements I see:
Supports Java and C/C++  
Compact
Fast to parse

There are many libraries which do that (http://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats). One is http://code.google.com/apis/protocolbuffers/

The only potential problem I see is that the interface is C++ and not C. But there is really no platform anymore without a C++ compiler - so I don't see why we wouldn't want to compile the proxy/sdm with C++.

Roland

On Mon, Mar 29, 2010 at 4:17 PM, Dave Wootton <dwootton@xxxxxxxxxx> wrote:

Greg

This might work, but I think it gets a bit complicated in the Eclipse client code if we don't assume the message arguments and event arguments are of type String.

I've already modified the args array that is built on the proxy side to contain the existing array of char * and a character array parallel to that defining the type of the argument as either string or enum. To do this right, the args array should probably be an array of a union between char *, int, enum and bitset *. I implemented the 'type' array as a second array instead of defining a struct containing type flag and a 4 byte int/pointer since I was concerned the compiler would pad this and make it an 8 byte element rather than a 5 byte element, meaning three wasted bytes per argument.
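
The padding concern can be checked with Python's struct module, which can compute sizes using either the platform's native C alignment or no alignment at all. This is only a sketch of the arithmetic: it assumes a 4 byte value as in the text above, and on a platform with 8 byte pointers the sizes would be larger. The helper names are illustrative.

```python
import struct

# Layout "B" then "I": a 1-byte type flag followed by a 4-byte value.
PADDED_ELEMENT = struct.calcsize("@BI")  # native alignment, like a plain C struct
PACKED_ELEMENT = struct.calcsize("=BI")  # no alignment padding


def bytes_for_args(count, element_size):
    """Total bytes needed for `count` arguments of the given element size."""
    return count * element_size


def bytes_for_parallel_arrays(count):
    """One array of 4-byte values plus a parallel array of 1-byte type flags."""
    return count * struct.calcsize("=I") + count * struct.calcsize("=B")
```

On common platforms PADDED_ELEMENT is 8 and PACKED_ELEMENT is 5, so two parallel arrays use the same 5 bytes per argument as a packed struct without relying on compiler specific packing directives.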


proxy_serialize_msg could be changed to prefix each argument with a type byte. When the client retrieves the message, the code handling that in the ProxyPacket class must recognize the argument type byte and decode the following argument according to type. The rest of the event handling code in the Eclipse client seems to be oriented around treating the arguments as a generic array of String. If we were to change this, I think we end up defining specific constructors for a bunch of events that accept differing sets of arguments based on event type and adding a bunch of more specific event encoding logic.


proxy_deserialize_msg currently pulls each argument out of the message buffer and puts it into the args array as a string. That could be changed to construct the args array with each argument being stored as the proper type.
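
A minimal Python sketch of what prefixing each argument with a type byte could look like, so that the receiving side rebuilds each argument as its proper type instead of a string. The type codes and the function names serialize_args and deserialize_args are hypothetical. They are not the codes or signatures used by proxy_serialize_msg or proxy_deserialize_msg.

```python
TYPE_STRING, TYPE_INT, TYPE_BITSET = 1, 2, 3  # hypothetical type codes


def _varint(n):
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)


def _read_varint(buf, pos):
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def serialize_args(args):
    """Write each argument as: type byte, then a type-specific payload."""
    out = bytearray()
    for arg in args:
        if isinstance(arg, str):
            data = arg.encode("utf-8")
            out += bytes([TYPE_STRING]) + _varint(len(data)) + data
        elif isinstance(arg, int):
            out += bytes([TYPE_INT]) + _varint(arg)
        elif isinstance(arg, bytes):
            out += bytes([TYPE_BITSET]) + _varint(len(arg)) + arg
        else:
            raise TypeError("unsupported argument type: %r" % type(arg))
    return bytes(out)


def deserialize_args(buf):
    """Rebuild the argument list with each value in its proper type."""
    args, pos = [], 0
    while pos < len(buf):
        kind = buf[pos]
        pos += 1
        value, pos = _read_varint(buf, pos)  # the int itself, or a byte count
        if kind == TYPE_INT:
            args.append(value)
        elif kind in (TYPE_STRING, TYPE_BITSET):
            raw = buf[pos:pos + value]
            pos += value
            args.append(raw.decode("utf-8") if kind == TYPE_STRING else raw)
        else:
            raise ValueError("unknown type byte %d" % kind)
    return args
```

For example, the three arguments "job42", 20000 and a two byte bitset take 15 bytes in this sketch, and the integer comes back as an integer without passing through a string.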

There are some functions in the SDM utils/event.c source file that do some parameter validation, get the next string argument out of the message array and convert to the proper type (dbg_str_to_*). I think those get changed to keep the validation, but the conversion becomes a copy or maybe a different conversion.


The Eclipse client seems to be implemented with the assumption that the array of arguments to a command is an array of String, and that the set of parameters associated with an event is also an array of String. For commands sent from the client to the proxy, this probably isn't a problem since the proxy command handler functions currently assume they get an array of char * as a parameter.
The debug commands issued by the client are defined such that they consist of an array of String arguments or a bitset passed as a String. I think this means that all debugger commands sent by the client to the debugger are assumed to be sent as strings and it's up to the handler in the debugger to understand what the real type is and convert accordingly.

Dave
From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/29/2010 12:53 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx








Hi Dave,

I agree that it would be nice if we could be more intelligent about types rather than sending everything as strings. What do you think about adding a byte to each argument to indicate a data type? We currently have key/val, string, and int, but we could also add other types where it would make sense for efficiency.

Other than the corresponding routines in org.eclipse.ptp.proxy.protocol, I can't think of anywhere else in the debugger that would be impacted.

Cheers,
Greg



On Mar 29, 2010, at 10:30 AM, Dave Wootton wrote:



Greg


I looked at the SDM code and think I have additional changes on the proxy side of the connection as follows:

1) sdm_message_send serializes msg->aggregate, msg->src and msg->dest by converting them to ASCII strings. I think I need to convert the aggregate value to varint and the src and dest to an array of byte data. The body of the message has already been converted to the new binary protocol by proxy_serialize_msg.

2) The aggregate, src and dest need to be converted back to int and bitset in sdm_message_progress. The body of the message gets converted back to message header and args array form in proxy_deserialize_msg.

3) In proxy_deserialize_msg, it looks like each argument gets added to the args array as a string value, where if the string represents an enumeration, the value is reconstructed as key=value.

4) DbgDeserializeEvent looks like it is ok as-is. Converting the message from binary format to the existing message header and array of string arguments in proxy_deserialize_msg then parsing the message header and array of strings format into the proper internal variables in DbgDeserializeEvent seems a little inefficient in terms of CPU time. However, if proxy_deserialize_msg were to do anything more intelligent, then I think each argument in the binary message format needs to carry a type specification so it can be properly decoded. There are probably a number of other changes elsewhere in the code if we change the internal message structures to deserialize the message more intelligently.
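
As an illustration of item 1, the following Python sketch converts a set of task numbers to a byte array and back, one bit per task. The bit order (task 0 in the least significant bit of the first byte) and the function names are assumptions made for this sketch. The actual SDM bitset layout is not described in this thread.

```python
def bitset_to_bytes(members, nbits):
    """Pack a set of task numbers into ceil(nbits / 8) bytes, one bit per task."""
    if any(i < 0 or i >= nbits for i in members):
        raise ValueError("task number out of range")
    buf = bytearray((nbits + 7) // 8)
    for i in members:
        buf[i // 8] |= 1 << (i % 8)
    return bytes(buf)


def bytes_to_bitset(buf):
    """Recover the set of task numbers from the packed bytes."""
    return {
        8 * index + bit
        for index, byte in enumerate(buf)
        for bit in range(8)
        if (byte >> bit) & 1
    }
```

With this layout a destination set for 20,000 tasks always occupies 2,500 bytes, however many tasks are selected.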


These are the changes I can find by just reading the code. There might be more that will be found as part of actually changing the code.


Does this seem reasonable?


Dave
From: Greg Watson <g.watson@xxxxxxxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
Date: 03/24/2010 04:46 PM
Subject: Re: [ptp-dev] SDM Debugger/Eclipse client protocol
Sent by: ptp-dev-bounces@xxxxxxxxxxx









Yes, the debugger protocol is in org.eclipse.ptp.proxy.protocol, and the SDM (org.eclipse.ptp.debug.sdm) uses both the proxy and utils libraries. For the C side, take a look in src/client/client_cmds.c and src/utils/event.c.

Greg

On Mar 24, 2010, at 10:27 AM, Dave Wootton wrote:



Greg

I realized that in my rework of the client/proxy protocol I didn't consider SDM debugger communication with the Eclipse client. Does the debugger use the same ProxyPacket class as the proxies use, and does the SDM debugger use the same org.eclipse.ptp.proxy and org.eclipse.ptp.utils libraries as the proxies use? Are there other places where I should look as part of implementing the binary proxy protocol changes?

Dave
_______________________________________________
ptp-dev mailing list

ptp-dev@xxxxxxxxxxx
https://dev.eclipse.org/mailman/listinfo/ptp-dev
--
ORNL/UT Center for Molecular Biophysics
cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309