Re: [sumo-user] Unspecified Fatal Error and blank trace logging outputs

Also, how can I run valgrind and traci.start at the same time? Without traci, my script won't interact with the simulation, and the error will not occur.

On Sat, Mar 6, 2021 at 8:15 AM Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
All other executions reported the same error

See their trace file attached


Hi Harald

So, no success in reproducing the error via the trace file?


Sincerely,

Marcelo d'Almeida


On Sat, Mar 6, 2021 at 3:59 AM Harald Schaefer <fechsaer@xxxxxxxxx> wrote:

Hi Marcelo,

There is a tool under Linux called valgrind, which tracks all memory allocations and accesses. It is resource-intensive (time and memory), but it pinpoints situations where memory is allocated, later freed, and then used again.

A simple call is

    valgrind sumo-git-co/sumo/bin/sumoD debug.sumo.cfg

It works best with the debug version.
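
For a script that drives the simulation through TraCI, one option is to put valgrind in front of the SUMO binary in the command list passed to traci.start, which, as far as I know, launches that list as a subprocess and adds the remote port option itself. A minimal sketch under that assumption; `build_valgrind_cmd` and `start_under_valgrind` are hypothetical helper names, and the retry count is an arbitrary guess:

```python
def build_valgrind_cmd(sumo_binary, sumo_args, valgrind_opts=()):
    """Return a command list that runs SUMO under valgrind."""
    return ["valgrind", *valgrind_opts, sumo_binary, *sumo_args]


def start_under_valgrind(sumo_binary, sumo_args, retries=600):
    """Start SUMO under valgrind and connect to it via TraCI."""
    # Imported here so the helper above can be used without SUMO installed.
    import traci

    cmd = build_valgrind_cmd(sumo_binary, sumo_args)
    # valgrind slows startup considerably, so allow more connection
    # attempts than the default (assumed value, tune as needed).
    traci.start(cmd, numRetries=retries)
```

For example, `build_valgrind_cmd("sumo-git-co/sumo/bin/sumoD", ["debug.sumo.cfg"])` yields the same command as the call above, so the rest of the script can stay unchanged.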

Regards, Harald

On 05.03.21 at 19:21, Marcelo Andrade Rodrigues D Almeida wrote:
One of the tests finished

NO_REROUTING_THREADS (see attached trace file)

Reading symbols from ../../sumo-git/bin/sumo...(no debugging symbols found)...done.
[New LWP 3412]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `sumo-git/bin/sumo -n /app/scenario/experimental/Bologna_small-0.29.0/joined/joi'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055a5cea65826 in MSVehicle::getBoundingBox() const ()
(gdb) bt
#0  0x000055a5cea65826 in MSVehicle::getBoundingBox() const ()
#1  0x000055a5ceae2e61 in MSLane::detectCollisions(long long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#2  0x000055a5ceac13f4 in MSEdgeControl::detectCollisions(long long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#3  0x000055a5cea1c944 in MSNet::simulationStep() ()
#4  0x000055a5cea1cb56 in MSNet::simulate(long long, long long) ()
#5  0x000055a5cea045ad in main ()


Still waiting on NO_MULTIPROCESS and NO_REROUTING_THREADS_NO_MULTIPROCESS


Sincerely,

Marcelo d'Almeida


On Fri, Mar 5, 2021 at 11:03 AM Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
It seems great!

Thank you!!

On Fri, Mar 5, 2021 at 9:48 AM Jakob Erdmann <namdre.sumo@xxxxxxxxx> wrote:
I think we're getting closer and closer to making tracing work in your use case: https://github.com/eclipse/sumo/issues/8323

On Fri, Mar 5, 2021 at 12:34, Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
For instance, all but "traci.close()" and "traci.simulationStep()" are duplicated in the second run

On Fri, Mar 5, 2021 at 8:27 AM Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
Hi Jakob

Both "trace_file_label.txt" and "trace_file_new_label.txt" should have the same information since they execute the same code

Currently (in the updated version), they don't.

Something else might be interfering in the trace file generation


Sincerely,

Marcelo d'Almeida


On Thu, Mar 4, 2021 at 6:34 PM Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
Thank you

Tomorrow I'll start new tests with the new development version to debug the original issue

On Thu, Mar 4, 2021, 6:06 PM Jakob Erdmann <namdre.sumo@xxxxxxxxx> wrote:
Thanks for the example. The problem was due to https://github.com/eclipse/sumo/issues/8320

On Thu, Mar 4, 2021 at 20:16, Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
I could reproduce the trace file problem

See attached files





On Thu, Mar 4, 2021 at 3:29 PM Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
Just to clarify

The attached file is one of the finished tests from the last batch (traceGetters False)

But both this one and the previous one (traceGetters True) showed zero simulation step entries in their trace files.



On Thu, Mar 4, 2021 at 3:21 PM Marcelo Andrade Rodrigues D Almeida <md@xxxxxxxxx> wrote:
I'm redoing the tests with traceGetters set to False to reduce the (huge) file size. Also, I had to restart the tests because someone or something turned off the remote machine overnight.


What I could find so far:

I could retrieve a trace file from the remote server (the huge one) and I found something very odd.

In my trivial test, I found a regular trace file

"traci.start(['/home/marcelo/code/sumo/bin/sumo-gui', '-n', '/home/marcelo/temp2/temp/temp/temp/regular-intersection__right_on_red.net.xml', '-r', '/home/marcelo/temp2/temp/temp/temp/regular-intersection.rou.xml', '--start', 'True'], port=None, label='default')
traci.trafficlight.setPhase('gneJ0', 0)
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.trafficlight.setPhase('gneJ0', 0)
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.simulationStep()
traci.trafficlight.setPhase('gneJ0', 0)
"

In my actual experiment (with multi_processing set to False), all 'traci.simulationStep()' commands are gone (see file attached for complete trace):
"
traci.trafficlight.setRedYellowGreenState('231', 'rrrrGGggrrrGGGgyyyyrrrrGGGGggrrrrrrrrrrrrGGg')
traci.trafficlight.setRedYellowGreenState('231', 'rrrrGGggrrryyyyyyyyrrrrGGGGggrrrrrrrrrrrrGGg')
traci.trafficlight.setRedYellowGreenState('233', 'rryyyyrrrryy')
traci.trafficlight.setRedYellowGreenState('282', 'rrryyyrrryyy')
traci.trafficlight.setRedYellowGreenState('221', 'yyyrrrryyyyyyyrrrrryyy')
traci.trafficlight.setRedYellowGreenState('220', 'GGGrrrrryyyrr')
traci.trafficlight.setRedYellowGreenState('209', 'ryrGGrr')
traci.trafficlight.setRedYellowGreenState('210', 'rrrGGGGGrrrrrrrryyyy')
traci.trafficlight.setRedYellowGreenState('273', 'yyyrrrryy')
traci.vehicle.subscribe('Prati_Capraia_100_70', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Borgo_20_56', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Malvasia_100_70', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Pertini_20_159', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Costa_700_126', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Pepoli_10_199', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Gandhi_40_219', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Audinot_3_16', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Pepoli_10_200', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.vehicle.subscribe('Silvani_7_145', [66, 64, 122, 86, 183, 76, 72, 68, 81, 71, 77, 67, 181])
traci.trafficlight.setRedYellowGreenState('231', 'rrrrGGggrrryyyyrrrrGGrrGGGGggrrrrrrrrrrrrGGg')
traci.trafficlight.setRedYellowGreenState('231', 'rrrrGGggrrrrrrrrrrrGGrrGGGGggrrrrrrrrrrrrGGg')
traci.trafficlight.setRedYellowGreenState('282', 'GGgrrrGGgrrr')
traci.trafficlight.setRedYellowGreenState('220', 'GGGrrrrGrrrrr')
traci.trafficlight.setRedYellowGreenState('209', 'GrGGGrr')
traci.trafficlight.setRedYellowGreenState('210', 'rrrGGGGGrrrrrrGGrrrr')
traci.trafficlight.setRedYellowGreenState('273', 'rrrGGGGrr')
traci.trafficlight.setRedYellowGreenState('233', 'GGrrrrGGGgrr')
traci.trafficlight.setRedYellowGreenState('221', 'rrrGGGGrrrrrrrGGGGGrrr')
"

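A quick way to check whether a trace file is missing its step commands is to count the calls it contains. A minimal sketch, assuming each trace line starts with a dotted `traci...(` call as in the excerpts above; `count_trace_calls` is a hypothetical helper, not part of traci:

```python
import re
from collections import Counter

# Matches the dotted call name at the start of a trace line, e.g.
# "traci.trafficlight.setPhase" in "traci.trafficlight.setPhase('gneJ0', 0)".
# A leading double quote is tolerated.
_CALL = re.compile(r'^\s*"?(traci(?:\.\w+)+)\(')


def count_trace_calls(lines):
    """Count how often each traci call appears in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = _CALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Used as `count_trace_calls(open("trace_file_label.txt"))`, a count of zero for `traci.simulationStep` would confirm that the step commands are missing from that file.
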
This was the reported crash from this execution:
#0  0x000055d660c7cf86 in MSVehicle::getBoundingBox() const ()
#1  0x000055d660cfa5b1 in MSLane::detectCollisions(long long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#2  0x000055d660cd8b54 in MSEdgeControl::detectCollisions(long long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#3  0x000055d660c34294 in MSNet::simulationStep() ()
#4  0x000055d660c344a6 in MSNet::simulate(long long, long long) ()
#5  0x000055d660c1c37d in main ()

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

What I could find about the trace file generation problem

Without multi-processing, traci discards the trace file info of the first run and keeps the traces of the following runs.
With multi-processing, every traci simulation is handled as a first run (i.e., a new process), so everything is thrown away.

It doesn't matter if it is a regular or a debug build

I don't know why the first run is discarded. I'll keep looking.
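
To make it easier to see which run loses its trace, each run can be given its own label and trace file. A minimal sketch, assuming the `traceFile` and `traceGetters` arguments of traci.start; the helper names and the file naming scheme are hypothetical, and this only separates the outputs, it does not fix the discarding itself:

```python
import os


def trace_file_for(run_index, directory="."):
    """Return a trace file path unique to this process and run."""
    name = f"trace_pid{os.getpid()}_run{run_index}.txt"
    return os.path.join(directory, name)


def start_traced_run(cmd, run_index):
    """Start one simulation run with its own label and trace file."""
    # Imported lazily so the path helper works without SUMO installed.
    import traci

    label = f"run{run_index}"
    traci.start(cmd, label=label,
                traceFile=trace_file_for(run_index),
                traceGetters=False)  # skip getter calls to keep the file small
    return label
```

Including the process id in the name keeps parallel workers from writing to the same file.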


I'll post any new information here.


On Wed, Mar 3, 2021 at 10:08 AM Jakob Erdmann <namdre.sumo@xxxxxxxxx> wrote:
Hi Harald,
the loop with the AnyVehicleIterator should never yield nullptrs. Hence the real bug is someplace else. 
The 4 worker threads in the stacktrace are due to '--device.rerouting.threads', '4', which doesn't really help to explain this (parallel routing typically doesn't cause premature vehicle deletion).
Had the threads come from option --threads, that would have been a likely cause of the issue since we have far fewer tests for this.

Nevertheless, @Marcelo: please try running without the option --device.rerouting.threads and see if you can still trigger the crash.
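
If the SUMO command is built as a list for traci.start, the option and its value can be dropped from that list before starting. A minimal sketch; `without_option` is a hypothetical helper and the example command is illustrative only:

```python
def without_option(cmd, option, has_value=True):
    """Return a copy of cmd with `option` (and its value, if any) removed."""
    result = []
    skip = 0
    for arg in cmd:
        if skip:
            # This element is the value of the option just removed.
            skip -= 1
            continue
        if arg == option:
            skip = 1 if has_value else 0
            continue
        result.append(arg)
    return result


cmd = ["sumo", "-n", "net.xml", "--device.rerouting.threads", "4"]
print(without_option(cmd, "--device.rerouting.threads"))
```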

Either way, I will probably need a traci-traceFile to fix this.

regards,
Jakob

On Wed, Mar 3, 2021 at 13:55, Harald Schaefer <fechsaer@xxxxxxxxx> wrote:

Hi Marcelo, hi Jakob,

thanks for the backtraces (looks good)

The problem in this scenario is that MSVehicle::getBoundingBox is called on a null object (this=0x0) from this loop:

        for (AnyVehicleIterator veh = anyVehiclesBegin(); veh != anyVehiclesEnd(); ++veh) {
            MSVehicle* collider = const_cast<MSVehicle*>(*veh);
            //std::cout << "   collider " << collider->getID() << "\n";
            PositionVector colliderBoundary = collider->getBoundingBox();

Thread 1 (Thread 0x7fb4974cd780 (LWP 12544)):
#0  0x0000561970425dcc in MSVehicle::getBoundingBox (this=0x0) at /app/sumo-git/src/microsim/MSVehicle.cpp:5925
#1  0x00005619704c23f5 in MSLane::detectCollisions (this=0x561972d88020, timestep=947000, stage="move") at /app/sumo-git/src/microsim/MSLane.cpp:1358

Regards, Harald


_______________________________________________
sumo-user mailing list
sumo-user@xxxxxxxxxxx
To unsubscribe from this list, visit https://www.eclipse.org/mailman/listinfo/sumo-user