I'm an Expert in Memory Management & Segfaults, Ask Me Anything!

Jason C. McDonald on September 08, 2018

I'm an expert-level C and C++ developer, with a specialty in memory management. I have experience writing memory-safe code with both the modern saf...
 

Hello, I am learning robotics and using ros-kinetic with gazebo7. I am trying to launch my model in Gazebo but got stuck on a "Segmentation fault (core dumped)" error at:

0x00007fffc96ac0ed in ros::NodeHandle::destruct() ()
from /opt/ros/kinetic/lib/libroscpp.so

Kindly advise.

 

There are only two ways to debug a segmentation fault, ordinarily:

1) If you have access to the source code for ros-kinetic, you would need to compile it yourself with the -g flag (debug flag), and then try to use it the same way as before. Then, when the segfault occurs, you'll get a file and line number instead of the raw memory address (0x00007fffc96ac0ed), and that will tell you where in the code the segfault is (probably) happening.

2) To get more information, you can run the code (again, compiled with -g) through a dynamic memory analyser like Valgrind. That will not only give you the file and line number where the segfault is probably occurring, but also a hint about what's going on, and possibly a longer stack trace.

Given the information from (1) or (2) (and a snippet of the offending source code), I could probably help you from there.

However, if ros-kinetic is not your project, you'll be best off filing a bug report on their issue tracker.

 

Thanks for the advice. I did compile ros-kinetic from source, but then gdb wouldn't launch, and I don't know why. So I reinstalled ros-kinetic from apt and ran it again, and gdb worked. I did find the source for the function where the segmentation fault is reported:

void NodeHandle::destruct()
{
  delete collection_;

  boost::mutex::scoped_lock lock(g_nh_refcount_mutex);

  --g_nh_refcount;

  if (g_nh_refcount == 0 && g_node_started_by_nh)
  {
    ros::shutdown();
  }
}

The backtrace runs to frame 24; I can provide the rest too if the fault is not in this part of the code.
If you could help me find the error, it would really boost my learning.

The stack trace would be really helpful. Also, please be sure to precede your code example with three backticks (`) on the line above the example, and three on the line below.

Here is the stack trace:

#0  0x00007fffc96ac0ed in ros::NodeHandle::destruct() ()
   from /opt/ros/kinetic/lib/libroscpp.so

#1  0x00007fffc96ac269 in ros::NodeHandle::~NodeHandle() ()
   from /opt/ros/kinetic/lib/libroscpp.so

#2  0x00007fff101bf4b4 in realtime_tools::RealtimePublisher<pr2_mechanism_msgs::MechanismStatistics_<std::allocator<void> > >::~RealtimePublisher (
    this=0x7fff0d7a1508, __in_chrg=<optimized out>)
    at /opt/ros/kinetic/include/realtime_tools/realtime_publisher.h:84

#3  pr2_controller_manager::ControllerManager::~ControllerManager (
    this=0x7fff0d7a0e80, __in_chrg=<optimized out>)
    at /home/deadmanlogan/i_am_from_source/ros_catkin_ws/src/pr2_mechanism/pr2_controller_manager/src/controller_manager.cpp:63

#4  0x00007fff101bfed9 in pr2_controller_manager::ControllerManager::~ControllerManager (this=0x7fff0d7a0e80, __in_chrg=<optimized out>)
    at /home/deadmanlogan/i_am_from_source/ros_catkin_ws/src/pr2_mechanism/pr2_controller_manager/src/controller_manager.cpp:67

#5  0x00007fff104ab96a in gazebo::GazeboRosControllerManager::~GazeboRosControllerManager (this=0x7fff0d7a0960, __in_chrg=<optimized out>)
    at /home/deadmanlogan/i_am_from_source/ros_catkin_ws/src/pr2_simulator/pr2_gazebo_plugins/src/gazebo_ros_controller_manager.cpp:85

#6  0x00007fff104abaf6 in gazebo::GazeboRosControllerManager::~GazeboRosControllerManager (this=0x7fff0d7a0960, __in_chrg=<optimized out>)
    at /home/deadmanlogan/i_am_from_source/ros_catkin_ws/src/pr2_simulator/pr2_gazebo_plugins/src/gazebo_ros_controller_manager.cpp:94

#7  0x00007ffff5ba80c9 in boost::checked_delete<gazebo::ModelPlugin> (
    x=0x7fff0d7a0960) at /usr/include/boost/core/checked_delete.hpp:34

#8  0x00007ffff5baadd6 in boost::detail::sp_counted_impl_p<gazebo::ModelPlugin>::dispose (this=0x7fff2e023430)
    at /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78

#9  0x00007ffff59b4efe in boost::detail::sp_counted_base::release (
    this=0x7fff2e023430)
    at /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146

#10 0x00007ffff59b4f91 in boost::detail::shared_count::~shared_count (
    this=0x7fff63ffb3d8, __in_chrg=<optimized out>)
    at /usr/include/boost/smart_ptr/detail/shared_count.hpp:443

#11 0x00007ffff5b9f086 in boost::shared_ptr<gazebo::ModelPlugin>::~shared_ptr (
    this=0x7fff63ffb3d0, __in_chrg=<optimized out>)
    at /usr/include/boost/smart_ptr/shared_ptr.hpp:323

#12 0x00007ffff5b9a392 in gazebo::physics::Model::LoadPlugin (
    this=0x7fff3c24d3b0, _sdf=std::shared_ptr (count 3, weak 5) 0x7fff368df840)
    at /home/deadmanlogan/i_am_from_source/Gazebo-7/gazebo/physics/Model.cc:1002

#13 0x00007ffff5b999e2 in gazebo::physics::Model::LoadPlugins (
    this=0x7fff3c24d3b0)
    at /home/deadmanlogan/i_am_from_source/Gazebo-7/gazebo/physics/Model.cc:915

#14 0x00007ffff5c08bd9 in gazebo::physics::World::ProcessFactoryMsgs (this=0x14f2480)
    at /home/deadmanlogan/i_am_from_source/Gazebo-7/gazebo/physics/World.cc:1958

#15 0x00007ffff5c0b9de in gazebo::physics::World::ProcessMessages (
    this=0x14f2480)
    at /home/deadmanlogan/i_am_from_source/Gazebo-7/gazebo/physics/World.cc:2282

#16 0x00007ffff5c0069f in gazebo::physics::World::Step (this=0x14f2480)
    at /home/deadmanlogan/i_am_from_source/Gazebo-7/gazebo/physics/World.cc:688

#17 0x00007ffff5bff06c in gazebo::physics::World::RunLoop (this=0x14f2480)
    at /home/deadmanlogan/i_am_from_source/Gazebo-7/gazebo/physics/World.cc:481

#18 0x00007ffff5c2e413 in boost::_mfi::mf0<void, gazebo::physics::World>::operator() (this=0x128c3b8, p=0x14f2480)
    at /usr/include/boost/bind/mem_fn_template.hpp:49

#19 0x00007ffff5c2d4be in boost::_bi::list1<boost::_bi::value<gazebo::physics::World*> >::operator()<boost::_mfi::mf0<void, gazebo::physics::World>, boost::_bi::list0> (this=0x128c3c8, f=..., a=...) at /usr/include/boost/bind/bind.hpp:253

#20 0x00007ffff5c2b74a in boost::_bi::bind_t<void, boost::_mfi::mf0<void, gazebo::physics::World>, boost::_bi::list1<boost::_bi::value<gazebo::physics::World*> > >::operator() (this=0x128c3b8) at /usr/include/boost/bind/bind.hpp:893

#21 0x00007ffff5c3070a in boost::detail::thread_data<boost::_bi::bind_t<void, boost::_mfi::mf0<void, gazebo::physics::World>, boost::_bi::list1<boost::_bi::value<gazebo::physics::World*> > > >::run (this=0x128c200) at /usr/include/boost/thread/detail/thread.hpp:116

#22 0x00007ffff35c65d5 in boost::(anonymous namespace)::thread_proxy (
    param=<optimized out>) at libs/thread/src/pthread/thread.cpp:168

#23 0x00007ffff79086ba in start_thread (arg=0x7fff63ffd700)
    at pthread_create.c:333

#24 0x00007ffff64c741d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Awesome, and the code you posted earlier, is that the context for /opt/ros/kinetic/include/realtime_tools/realtime_publisher.h:84?

Also, what is the rest of the Valgrind output? Any more details? Segfaults have many causes, so knowing which one was detected helps narrow down the problem.

The code I posted earlier is the context for frames 1 and 2. Also, what I posted was the gdb output.

I just ran the same thing under Valgrind, and it gave:

deadmanlogan@war:~$ roslaunch pr2_description pr2.launch
... logging to /home/deadmanlogan/.ros/log/12f716ec-e077-11e9-af75-68071520849c/roslaunch-war-13334.log
Checking log directory for disk usage. This may take awhile.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

xacro: Traditional processing is deprecated. Switch to --inorder processing!
To check for compatibility of your document, use option --check-order.
For more infos, see http://wiki.ros.org/xacro#Processing_Order
xacro.py is deprecated; please use xacro instead
started roslaunch server http://war:36573/

SUMMARY
========

PARAMETERS
 * /robo_state_publisher/publish_frequency: 30.0
 * /robot_description: <?xml version="1....
 * /rosdistro: kinetic
 * /rosversion: 1.12.14
 * /use_sim_time: True

NODES
  /
    gazebo (gazebo_ros/gzserver)
    robo_state_publisher (robot_state_publisher/robot_state_publisher)
    urdf_spawner (gazebo_ros/spawn_model)

auto-starting new master
process[master]: started with pid [13348]
ROS_MASTER_URI=http://localhost:11311

setting /run_id to 12f716ec-e077-11e9-af75-68071520849c
process[rosout-1]: started with pid [13361]
started core service [/rosout]
process[gazebo-2]: started with pid [13384]
process[urdf_spawner-3]: started with pid [13386]
process[robo_state_publisher-4]: started with pid [13387]
==13384== Memcheck, a memory error detector
==13384== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==13384== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==13384== Command: /home/deadmanlogan/i_am_from_source/ros_catkin_ws/src/gazebo_ros_pkgs/gazebo_ros/scripts/gzserver -e ode worlds/empty.world __name:=gazebo __log:=/home/deadmanlogan/.ros/log/12f716ec-e077-11e9-af75-68071520849c/gazebo-2.log
==13384== 
SpawnModel script started
[INFO] [1569513769.345868, 0.000000]: Loading model XML from ros parameter
[INFO] [1569513769.363155, 0.000000]: Waiting for service /gazebo/spawn_urdf_model
==13447== Warning: invalid file descriptor -1 in syscall close()
==13448== 
==13448== HEAP SUMMARY:
==13448==     in use at exit: 23,063 bytes in 127 blocks
==13448==   total heap usage: 220 allocs, 93 frees, 96,415 bytes allocated
==13448== 
==13448== LEAK SUMMARY:
==13448==    definitely lost: 0 bytes in 0 blocks
==13448==    indirectly lost: 0 bytes in 0 blocks
==13448==      possibly lost: 0 bytes in 0 blocks
==13448==    still reachable: 23,063 bytes in 127 blocks
==13448==         suppressed: 0 bytes in 0 blocks
==13448== Rerun with --leak-check=full to see details of leaked memory
==13448== 
==13448== For counts of detected and suppressed errors, rerun with: -v
==13448== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13447== 
==13447== HEAP SUMMARY:
==13447==     in use at exit: 14,899 bytes in 127 blocks
==13447==   total heap usage: 219 allocs, 92 frees, 88,235 bytes allocated
==13447== 
==13447== LEAK SUMMARY:
==13447==    definitely lost: 0 bytes in 0 blocks
==13447==    indirectly lost: 0 bytes in 0 blocks
==13447==      possibly lost: 0 bytes in 0 blocks
==13447==    still reachable: 14,899 bytes in 127 blocks
==13447==         suppressed: 0 bytes in 0 blocks
==13447== Rerun with --leak-check=full to see details of leaked memory
==13447== 
==13447== For counts of detected and suppressed errors, rerun with: -v
==13447== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13450== Warning: invalid file descriptor -1 in syscall close()
==13451== 
==13451== HEAP SUMMARY:
==13451==     in use at exit: 23,112 bytes in 129 blocks
==13451==   total heap usage: 226 allocs, 97 frees, 96,533 bytes allocated
==13451== 
==13451== LEAK SUMMARY:
==13451==    definitely lost: 0 bytes in 0 blocks
==13451==    indirectly lost: 0 bytes in 0 blocks
==13451==      possibly lost: 0 bytes in 0 blocks
==13451==    still reachable: 23,112 bytes in 129 blocks
==13451==         suppressed: 0 bytes in 0 blocks
==13451== Rerun with --leak-check=full to see details of leaked memory
==13451== 
==13451== For counts of detected and suppressed errors, rerun with: -v
==13451== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13450== 
==13450== HEAP SUMMARY:
==13450==     in use at exit: 14,948 bytes in 129 blocks
==13450==   total heap usage: 225 allocs, 96 frees, 88,353 bytes allocated
==13450== 
==13450== LEAK SUMMARY:
==13450==    definitely lost: 0 bytes in 0 blocks
==13450==    indirectly lost: 0 bytes in 0 blocks
==13450==      possibly lost: 0 bytes in 0 blocks
==13450==    still reachable: 14,948 bytes in 129 blocks
==13450==         suppressed: 0 bytes in 0 blocks
==13450== Rerun with --leak-check=full to see details of leaked memory
==13450== 
==13450== For counts of detected and suppressed errors, rerun with: -v
==13450== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13453== Warning: invalid file descriptor -1 in syscall close()
==13454== 
==13454== HEAP SUMMARY:
==13454==     in use at exit: 23,131 bytes in 129 blocks
==13454==   total heap usage: 231 allocs, 102 frees, 96,638 bytes allocated
==13454== 
==13454== LEAK SUMMARY:
==13454==    definitely lost: 0 bytes in 0 blocks
==13454==    indirectly lost: 0 bytes in 0 blocks
==13454==      possibly lost: 0 bytes in 0 blocks
==13454==    still reachable: 23,131 bytes in 129 blocks
==13454==         suppressed: 0 bytes in 0 blocks
==13454== Rerun with --leak-check=full to see details of leaked memory
==13454== 
==13454== For counts of detected and suppressed errors, rerun with: -v
==13454== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13453== 
==13453== HEAP SUMMARY:
==13453==     in use at exit: 14,967 bytes in 129 blocks
==13453==   total heap usage: 230 allocs, 101 frees, 88,458 bytes allocated
==13453== 
==13453== LEAK SUMMARY:
==13453==    definitely lost: 0 bytes in 0 blocks
==13453==    indirectly lost: 0 bytes in 0 blocks
==13453==      possibly lost: 0 bytes in 0 blocks
==13453==    still reachable: 14,967 bytes in 129 blocks
==13453==         suppressed: 0 bytes in 0 blocks
==13453== Rerun with --leak-check=full to see details of leaked memory
==13453== 
==13453== For counts of detected and suppressed errors, rerun with: -v
==13453== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13456== Warning: invalid file descriptor -1 in syscall close()
==13457== 
==13457== HEAP SUMMARY:
==13457==     in use at exit: 23,146 bytes in 129 blocks
==13457==   total heap usage: 236 allocs, 107 frees, 96,758 bytes allocated
==13457== 
==13457== LEAK SUMMARY:
==13457==    definitely lost: 0 bytes in 0 blocks
==13457==    indirectly lost: 0 bytes in 0 blocks
==13457==      possibly lost: 0 bytes in 0 blocks
==13457==    still reachable: 23,146 bytes in 129 blocks
==13457==         suppressed: 0 bytes in 0 blocks
==13457== Rerun with --leak-check=full to see details of leaked memory
==13457== 
==13457== For counts of detected and suppressed errors, rerun with: -v
==13457== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13456== 
==13456== HEAP SUMMARY:
==13456==     in use at exit: 14,982 bytes in 129 blocks
==13456==   total heap usage: 235 allocs, 106 frees, 88,578 bytes allocated
==13456== 
==13456== LEAK SUMMARY:
==13456==    definitely lost: 0 bytes in 0 blocks
==13456==    indirectly lost: 0 bytes in 0 blocks
==13456==      possibly lost: 0 bytes in 0 blocks
==13456==    still reachable: 14,982 bytes in 129 blocks
==13456==         suppressed: 0 bytes in 0 blocks
==13456== Rerun with --leak-check=full to see details of leaked memory
==13456== 
==13456== For counts of detected and suppressed errors, rerun with: -v
==13456== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13460== Warning: invalid file descriptor -1 in syscall close()
==13461== 
==13461== HEAP SUMMARY:
==13461==     in use at exit: 23,275 bytes in 131 blocks
==13461==   total heap usage: 242 allocs, 111 frees, 96,968 bytes allocated
==13461== 
==13461== LEAK SUMMARY:
==13461==    definitely lost: 0 bytes in 0 blocks
==13461==    indirectly lost: 0 bytes in 0 blocks
==13461==      possibly lost: 0 bytes in 0 blocks
==13461==    still reachable: 23,275 bytes in 131 blocks
==13461==         suppressed: 0 bytes in 0 blocks
==13461== Rerun with --leak-check=full to see details of leaked memory
==13461== 
==13461== For counts of detected and suppressed errors, rerun with: -v
==13461== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13460== 
==13460== HEAP SUMMARY:
==13460==     in use at exit: 15,111 bytes in 131 blocks
==13460==   total heap usage: 241 allocs, 110 frees, 88,788 bytes allocated
==13460== 
==13460== LEAK SUMMARY:
==13460==    definitely lost: 0 bytes in 0 blocks
==13460==    indirectly lost: 0 bytes in 0 blocks
==13460==      possibly lost: 0 bytes in 0 blocks
==13460==    still reachable: 15,111 bytes in 131 blocks
==13460==         suppressed: 0 bytes in 0 blocks
==13460== Rerun with --leak-check=full to see details of leaked memory
==13460== 
==13460== For counts of detected and suppressed errors, rerun with: -v
==13460== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13463== Warning: invalid file descriptor -1 in syscall close()
==13464== 
==13464== HEAP SUMMARY:
==13464==     in use at exit: 23,278 bytes in 131 blocks
==13464==   total heap usage: 247 allocs, 116 frees, 97,149 bytes allocated
==13464== 
==13464== LEAK SUMMARY:
==13464==    definitely lost: 0 bytes in 0 blocks
==13464==    indirectly lost: 0 bytes in 0 blocks
==13464==      possibly lost: 0 bytes in 0 blocks
==13464==    still reachable: 23,278 bytes in 131 blocks
==13464==         suppressed: 0 bytes in 0 blocks
==13464== Rerun with --leak-check=full to see details of leaked memory
==13464== 
==13464== For counts of detected and suppressed errors, rerun with: -v
==13464== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13463== 
==13463== HEAP SUMMARY:
==13463==     in use at exit: 15,114 bytes in 131 blocks
==13463==   total heap usage: 246 allocs, 115 frees, 88,969 bytes allocated
==13463== 
==13463== LEAK SUMMARY:
==13463==    definitely lost: 0 bytes in 0 blocks
==13463==    indirectly lost: 0 bytes in 0 blocks
==13463==      possibly lost: 0 bytes in 0 blocks
==13463==    still reachable: 15,114 bytes in 131 blocks
==13463==         suppressed: 0 bytes in 0 blocks
==13463== Rerun with --leak-check=full to see details of leaked memory
==13463== 
==13463== For counts of detected and suppressed errors, rerun with: -v
==13463== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13466== Warning: invalid file descriptor -1 in syscall close()
==13467== 
==13467== HEAP SUMMARY:
==13467==     in use at exit: 23,388 bytes in 131 blocks
==13467==   total heap usage: 252 allocs, 121 frees, 97,367 bytes allocated
==13467== 
==13467== LEAK SUMMARY:
==13467==    definitely lost: 0 bytes in 0 blocks
==13467==    indirectly lost: 0 bytes in 0 blocks
==13467==      possibly lost: 0 bytes in 0 blocks
==13467==    still reachable: 23,388 bytes in 131 blocks
==13467==         suppressed: 0 bytes in 0 blocks
==13467== Rerun with --leak-check=full to see details of leaked memory
==13467== 
==13467== For counts of detected and suppressed errors, rerun with: -v
==13467== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13466== 
==13466== HEAP SUMMARY:
==13466==     in use at exit: 15,224 bytes in 131 blocks
==13466==   total heap usage: 251 allocs, 120 frees, 89,187 bytes allocated
==13466== 
==13466== LEAK SUMMARY:
==13466==    definitely lost: 0 bytes in 0 blocks
==13466==    indirectly lost: 0 bytes in 0 blocks
==13466==      possibly lost: 0 bytes in 0 blocks
==13466==    still reachable: 15,224 bytes in 131 blocks
==13466==         suppressed: 0 bytes in 0 blocks
==13466== Rerun with --leak-check=full to see details of leaked memory
==13466== 
==13466== For counts of detected and suppressed errors, rerun with: -v
==13466== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13469== Warning: invalid file descriptor -1 in syscall close()
==13470== 
==13470== HEAP SUMMARY:
==13470==     in use at exit: 23,391 bytes in 131 blocks
==13470==   total heap usage: 257 allocs, 126 frees, 97,588 bytes allocated
==13470== 
==13470== LEAK SUMMARY:
==13470==    definitely lost: 0 bytes in 0 blocks
==13470==    indirectly lost: 0 bytes in 0 blocks
==13470==      possibly lost: 0 bytes in 0 blocks
==13470==    still reachable: 23,391 bytes in 131 blocks
==13470==         suppressed: 0 bytes in 0 blocks
==13470== Rerun with --leak-check=full to see details of leaked memory
==13470== 
==13470== For counts of detected and suppressed errors, rerun with: -v
==13470== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13469== 
==13469== HEAP SUMMARY:
==13469==     in use at exit: 15,227 bytes in 131 blocks
==13469==   total heap usage: 256 allocs, 125 frees, 89,408 bytes allocated
==13469== 
==13469== LEAK SUMMARY:
==13469==    definitely lost: 0 bytes in 0 blocks
==13469==    indirectly lost: 0 bytes in 0 blocks
==13469==      possibly lost: 0 bytes in 0 blocks
==13469==    still reachable: 15,227 bytes in 131 blocks
==13469==         suppressed: 0 bytes in 0 blocks
==13469== Rerun with --leak-check=full to see details of leaked memory
==13469== 
==13469== For counts of detected and suppressed errors, rerun with: -v
==13469== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13472== Warning: invalid file descriptor -1 in syscall close()
==13473== 
==13473== HEAP SUMMARY:
==13473==     in use at exit: 23,499 bytes in 131 blocks
==13473==   total heap usage: 262 allocs, 131 frees, 97,917 bytes allocated
==13473== 
==13473== LEAK SUMMARY:
==13473==    definitely lost: 0 bytes in 0 blocks
==13473==    indirectly lost: 0 bytes in 0 blocks
==13473==      possibly lost: 0 bytes in 0 blocks
==13473==    still reachable: 23,499 bytes in 131 blocks
==13473==         suppressed: 0 bytes in 0 blocks
==13473== Rerun with --leak-check=full to see details of leaked memory
==13473== 
==13473== For counts of detected and suppressed errors, rerun with: -v
==13473== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13472== 
==13472== HEAP SUMMARY:
==13472==     in use at exit: 15,335 bytes in 131 blocks
==13472==   total heap usage: 261 allocs, 130 frees, 89,737 bytes allocated
==13472== 
==13472== LEAK SUMMARY:
==13472==    definitely lost: 0 bytes in 0 blocks
==13472==    indirectly lost: 0 bytes in 0 blocks
==13472==      possibly lost: 0 bytes in 0 blocks
==13472==    still reachable: 15,335 bytes in 131 blocks
==13472==         suppressed: 0 bytes in 0 blocks
==13472== Rerun with --leak-check=full to see details of leaked memory
==13472== 
==13472== For counts of detected and suppressed errors, rerun with: -v
==13472== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13446== Warning: invalid file descriptor -1 in syscall close()
==13475== 
==13475== HEAP SUMMARY:
==13475==     in use at exit: 22,463 bytes in 130 blocks
==13475==   total heap usage: 266 allocs, 136 frees, 98,740 bytes allocated
==13475== 
==13475== LEAK SUMMARY:
==13475==    definitely lost: 0 bytes in 0 blocks
==13475==    indirectly lost: 0 bytes in 0 blocks
==13475==      possibly lost: 0 bytes in 0 blocks
==13475==    still reachable: 22,463 bytes in 130 blocks
==13475==         suppressed: 0 bytes in 0 blocks
==13475== Rerun with --leak-check=full to see details of leaked memory
==13475== 
==13475== For counts of detected and suppressed errors, rerun with: -v
==13475== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==13446== 
==13446== HEAP SUMMARY:
==13446==     in use at exit: 13,282 bytes in 129 blocks
==13446==   total heap usage: 264 allocs, 135 frees, 90,047 bytes allocated
==13446== 
==13446== LEAK SUMMARY:
==13446==    definitely lost: 0 bytes in 0 blocks
==13446==    indirectly lost: 0 bytes in 0 blocks
==13446==      possibly lost: 0 bytes in 0 blocks
==13446==    still reachable: 13,282 bytes in 129 blocks
==13446==         suppressed: 0 bytes in 0 blocks
==13446== Rerun with --leak-check=full to see details of leaked memory
==13446== 
==13446== For counts of detected and suppressed errors, rerun with: -v
==13446== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[ INFO] [1569513770.755880492]: Finished loading Gazebo ROS API Plugin.
[ INFO] [1569513770.756283215]: waitForService: Service [/gazebo/set_physics_properties] has not been advertised, waiting...
[INFO] [1569513771.174828, 0.000000]: Calling service /gazebo/spawn_urdf_model
[ INFO] [1569513771.195916634, 0.023000000]: waitForService: Service [/gazebo/set_physics_properties] is now available.
Warning [parser_urdf.cc:1236] multiple inconsistent <gravity> exists due to fixed joint reduction overwriting previous value [true] with [false].
Warning [parser_urdf.cc:1236] multiple inconsistent <gravity> exists due to fixed joint reduction overwriting previous value [false] with [true].
[ INFO] [1569513772.678038791, 0.125000000]: Laser Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513772.678107422, 0.125000000]: Starting Laser Plugin (ns = /)
[ INFO] [1569513772.680463523, 0.125000000]: Laser Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513772.984139421, 0.125000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513772.988030718, 0.125000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513772.989881251, 0.125000000]: Camera Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513772.991364043, 0.125000000]: Camera Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513772.993889515, 0.125000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513772.999240022, 0.125000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513773.006935967, 0.125000000]: Camera Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513773.025098634, 0.125000000]: Camera Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513773.042621440, 0.125000000]: bayer simulation maybe computationally expensive.
[ WARN] [1569513773.042710689, 0.125000000]: The <focal_length>[320.000105] you have provided for camera_ [wide_stereo_l_stereo_camera_sensor] is inconsistent with specified image_width [640] and HFOV [1.570800].   Please double check to see that focal_length = width_ / (2.0 * tan(HFOV/2.0)), the explected focal_lengtth value is [319.998825], please update your camera_ model description accordingly.
[ INFO] [1569513773.044398440, 0.125000000]: bayer simulation maybe computationally expensive.
[ WARN] [1569513773.044485198, 0.125000000]: The <focal_length>[320.000105] you have provided for camera_ [wide_stereo_r_stereo_camera_sensor] is inconsistent with specified image_width [640] and HFOV [1.570800].   Please double check to see that focal_length = width_ / (2.0 * tan(HFOV/2.0)), the explected focal_lengtth value is [319.998825], please update your camera_ model description accordingly.
[ INFO] [1569513773.074322637, 0.125000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513773.078731037, 0.125000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513773.081757643, 0.125000000]: Camera Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513773.084314022, 0.125000000]: Camera Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513773.098842781, 0.125000000]: trigger_mode trigger_mode streaming
[ WARN] [1569513773.099203446, 0.125000000]: The <focal_length>[320.000105] you have provided for camera_ [l_forearm_cam_sensor] is inconsistent with specified image_width [640] and HFOV [1.570800].   Please double check to see that focal_length = width_ / (2.0 * tan(HFOV/2.0)), the explected focal_lengtth value is [319.998825], please update your camera_ model description accordingly.
[ INFO] [1569513773.775768464, 0.125000000]: Laser Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513773.775826138, 0.125000000]: Starting Laser Plugin (ns = /)
[ INFO] [1569513773.777002086, 0.125000000]: Laser Plugin (ns = /)  <tf_prefix_>, set to ""
[ INFO] [1569513773.798381464, 0.125000000]: Camera Plugin: Using the 'robotNamespace' param: '/'
[ INFO] [1569513773.801075468, 0.125000000]: Camera Plugin (ns = /)  <tf_prefix_>, set to ""
[ WARN] [1569513773.812511053, 0.125000000]: The <focal_length>[320.000105] you have provided for camera_ [r_forearm_cam_sensor] is inconsistent with specified image_width [640] and HFOV [1.570800].   Please double check to see that focal_length = width_ / (2.0 * tan(HFOV/2.0)), the explected focal_lengtth value is [319.998825], please update your camera_ model description accordingly.
[INFO] [1569513773.845363, 0.125000]: Spawn status: SpawnModel: Successfully spawned entity
[ INFO] [1569513773.864237580, 0.125000000]: Physics dynamic reconfigure ready.
[ INFO] [1569513773.912607340, 0.125000000]: starting gazebo_ros_controller_manager plugin in ns: /
[ INFO] [1569513773.913094029, 0.125000000]: Callback thread id=7f4d2487f700
[ INFO] [1569513773.915389709, 0.125000000]: gazebo controller manager plugin is waiting for urdf: //robot_description on the param server.  (make sure there is a rosparam by that name in the ros parameter server, otherwise, this plugin blocks simulation forever).
[ INFO] [1569513774.019387318, 0.125000000]: gazebo controller manager got pr2.xml from param server, parsing it...
[urdf_spawner-3] process has finished cleanly
log file: /home/deadmanlogan/.ros/log/12f716ec-e077-11e9-af75-68071520849c/urdf_spawner-3*.log
Segmentation fault (core dumped)
==13384== 
==13384== HEAP SUMMARY:
==13384==     in use at exit: 12,284 bytes in 120 blocks
==13384==   total heap usage: 214 allocs, 94 frees, 86,495 bytes allocated
==13384== 
==13384== LEAK SUMMARY:
==13384==    definitely lost: 0 bytes in 0 blocks
==13384==    indirectly lost: 0 bytes in 0 blocks
==13384==      possibly lost: 0 bytes in 0 blocks
==13384==    still reachable: 12,284 bytes in 120 blocks
==13384==         suppressed: 0 bytes in 0 blocks
==13384== Rerun with --leak-check=full to see details of leaked memory
==13384== 
==13384== For counts of detected and suppressed errors, rerun with: -v
==13384== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[gazebo-2] process has died [pid 13384, exit code 139, cmd valgrind /home/deadmanlogan/i_am_from_source/ros_catkin_ws/src/gazebo_ros_pkgs/gazebo_ros/scripts/gzserver -e ode worlds/empty.world __name:=gazebo __log:=/home/deadmanlogan/.ros/log/12f716ec-e077-11e9-af75-68071520849c/gazebo-2.log].
log file: /home/deadmanlogan/.ros/log/12f716ec-e077-11e9-af75-68071520849c/gazebo-2*.log

As for more details: I am on Ubuntu Xenial, using my launch file to launch my robot model in gazebo7 (robotics simulation software), and the simulator segfaults when it runs that launch file. Since the launch file comes ready-made from GitHub, I think there is probably no error in the launch file itself.

What do you think is causing the error based on my provided information?

o.O

Wow, I've never seen this one before. The segfault is occurring, but Valgrind doesn't seem to be catching it.

I'm curious how you're invoking Valgrind. Usually I'd just pass the executable right to it:

$ valgrind roslaunch pr2_description pr2.launch

I invoked Valgrind by specifying it as an option in the launch file itself, the same way I invoked gdb.

<node name="gazebo" pkg="gazebo_ros"  type="$(arg script_type)" respawn="$(arg respawn_gazebo)" output="$(arg output)" launch-prefix="valgrind"

I am very stressed by this problem, but I don't want to give up.
What do you suggest?

You know, I'd be really curious to know what would happen if you ran the launch file itself through Valgrind! If you look at the output from a moment ago, quite a lot is occurring outside of Valgrind: all the lines not preceded by ==nnnnn== (where nnnnn is some number). The segfault at the end appears to be occurring outside of that context as well. That leads me to believe the segfault might actually be within the launch file.

I just ran it through valgrind

[deleted]

Yikes. Could you delete that comment chain and put it in a Gist or bpaste.net or some such? It'll be easier to read.

In any case, that confirmed my suspicion: the launcher is the problem. It's not memory-pure at all.

[deleted]

After ending the process manually I further got the output

^C[robo_state_publisher-4] killing on exit
[rosout-1] killing on exit
[master] killing on exit
shutting down processing monitor...
... shutting down processing monitor complete
done
==15557== Invalid read of size 4
==15557==    at 0x41964F: PyObject_Free (in /usr/bin/python2.7)
==15557==    by 0x4D07FA: ??? (in /usr/bin/python2.7)
==15557==    by 0x4AA262: ??? (in /usr/bin/python2.7)
==15557==    by 0x4E0C11: ??? (in /usr/bin/python2.7)
==15557==    by 0x4FC2C9: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBADC: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Address 0x62f2020 is 2,592 bytes inside an unallocated block of size 2,768 in arena "client"
==15557== 
==15557== Invalid read of size 4
==15557==    at 0x41964F: PyObject_Free (in /usr/bin/python2.7)
==15557==    by 0x4AA262: ??? (in /usr/bin/python2.7)
==15557==    by 0x4E0C11: ??? (in /usr/bin/python2.7)
==15557==    by 0x4FC2C9: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBADC: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Address 0x7f37020 is 128 bytes inside a block of size 552 free'd
==15557==    at 0x4C2EDEB: free (vg_replace_malloc.c:530)
==15557==    by 0x50C4362: fclose@@GLIBC_2.2.5 (iofclose.c:84)
==15557==    by 0x43CEFB: ??? (in /usr/bin/python2.7)
==15557==    by 0x4A63FD: PyObject_Call (in /usr/bin/python2.7)
==15557==    by 0x5385A5: _PyObject_CallMethod_SizeT (in /usr/bin/python2.7)
==15557==    by 0x53F4CE: ??? (in /usr/bin/python2.7)
==15557==    by 0x4AEF42: PyObject_CallFunctionObjArgs (in /usr/bin/python2.7)
==15557==    by 0x4BF668: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4C2C3B: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4C24E9: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==  Block was alloc'd at
==15557==    at 0x4C2DB8F: malloc (vg_replace_malloc.c:299)
==15557==    by 0x50C4CDC: __fopen_internal (iofopen.c:69)
==15557==    by 0x53D247: ??? (in /usr/bin/python2.7)
==15557==    by 0x4AB6FA: ??? (in /usr/bin/python2.7)
==15557==    by 0x53CDBE: ??? (in /usr/bin/python2.7)
==15557==    by 0x4BD1D9: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4C2C3B: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4C24E9: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4C24E9: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557== 
==15557== Invalid read of size 4
==15557==    at 0x41964F: PyObject_Free (in /usr/bin/python2.7)
==15557==    by 0x4FC2C9: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBADC: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Address 0x7eea020 is 0 bytes inside a block of size 8 free'd
==15557==    at 0x4C2EDEB: free (vg_replace_malloc.c:530)
==15557==    by 0x49B1E4: ??? (in /usr/bin/python2.7)
==15557==    by 0x4D878E: ??? (in /usr/bin/python2.7)
==15557==    by 0x4BD778: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4D6218: ??? (in /usr/bin/python2.7)
==15557==    by 0x4EEC7D: ??? (in /usr/bin/python2.7)
==15557==    by 0x4A63FD: PyObject_Call (in /usr/bin/python2.7)
==15557==    by 0x4C6C2F: PyEval_CallObjectWithKeywords (in /usr/bin/python2.7)
==15557==    by 0x6EB480C: ??? (in /usr/lib/python2.7/lib-dynload/pyexpat.x86_64-linux-gnu.so)
==15557==    by 0x6EBCF3D: ??? (in /usr/lib/python2.7/lib-dynload/pyexpat.x86_64-linux-gnu.so)
==15557==    by 0x710D68F: ??? (in /lib/x86_64-linux-gnu/libexpat.so.1.6.0)
==15557==  Block was alloc'd at
==15557==    at 0x4C2DB8F: malloc (vg_replace_malloc.c:299)
==15557==    by 0x493F0E: PyList_New (in /usr/bin/python2.7)
==15557==    by 0x510D4D: ??? (in /usr/bin/python2.7)
==15557==    by 0x4BD1D9: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4D6218: ??? (in /usr/bin/python2.7)
==15557==    by 0x4EEC7D: ??? (in /usr/bin/python2.7)
==15557==    by 0x4A63FD: PyObject_Call (in /usr/bin/python2.7)
==15557==    by 0x4C6C2F: PyEval_CallObjectWithKeywords (in /usr/bin/python2.7)
==15557==    by 0x6EB480C: ??? (in /usr/lib/python2.7/lib-dynload/pyexpat.x86_64-linux-gnu.so)
==15557==    by 0x6EBCF3D: ??? (in /usr/lib/python2.7/lib-dynload/pyexpat.x86_64-linux-gnu.so)
==15557==    by 0x710D68F: ??? (in /lib/x86_64-linux-gnu/libexpat.so.1.6.0)
==15557== 
==15557== Invalid read of size 4
==15557==    at 0x41964F: PyObject_Free (in /usr/bin/python2.7)
==15557==    by 0x4D0C94: ??? (in /usr/bin/python2.7)
==15557==    by 0x4D086B: ??? (in /usr/bin/python2.7)
==15557==    by 0x4FC2C9: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBADC: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Address 0x611c020 is 48,736 bytes inside a block of size 49,152 free'd
==15557==    at 0x4C2EDEB: free (vg_replace_malloc.c:530)
==15557==    by 0x4AA3C4: ??? (in /usr/bin/python2.7)
==15557==    by 0x495BCA: PyDict_SetItem (in /usr/bin/python2.7)
==15557==    by 0x4FC278: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBADC: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Block was alloc'd at
==15557==    at 0x4C2FB55: calloc (vg_replace_malloc.c:711)
==15557==    by 0x498D2C: ??? (in /usr/bin/python2.7)
==15557==    by 0x4A252E: PyDict_Merge (in /usr/bin/python2.7)
==15557==    by 0x512275: ??? (in /usr/bin/python2.7)
==15557==    by 0x4BD1D9: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4C210E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4BA915: PyEval_EvalCodeEx (in /usr/bin/python2.7)
==15557==    by 0x4C2C3B: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4C210E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4C210E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4C210E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557==    by 0x4C210E: PyEval_EvalFrameEx (in /usr/bin/python2.7)
==15557== 
==15557== Invalid read of size 4
==15557==    at 0x41964F: PyObject_Free (in /usr/bin/python2.7)
==15557==    by 0x4D0B76: ??? (in /usr/bin/python2.7)
==15557==    by 0x4D086B: ??? (in /usr/bin/python2.7)
==15557==    by 0x495BCA: PyDict_SetItem (in /usr/bin/python2.7)
==15557==    by 0x4FC278: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBBBD: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Address 0x6083020 is 3,200 bytes inside a block of size 3,218 free'd
==15557==    at 0x4C2EDEB: free (vg_replace_malloc.c:530)
==15557==    by 0x4D0A85: ??? (in /usr/bin/python2.7)
==15557==    by 0x4D086B: ??? (in /usr/bin/python2.7)
==15557==    by 0x495BCA: PyDict_SetItem (in /usr/bin/python2.7)
==15557==    by 0x4FC278: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBBBD: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Block was alloc'd at
==15557==    at 0x4C2DB8F: malloc (vg_replace_malloc.c:299)
==15557==    by 0x4A0021: PyString_FromStringAndSize (in /usr/bin/python2.7)
==15557==    by 0x4B3F50: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B425C: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B414F: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B4272: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B3E65: PyMarshal_ReadObjectFromString (in /usr/bin/python2.7)
==15557==    by 0x4B3DE5: PyMarshal_ReadLastObjectFromFile (in /usr/bin/python2.7)
==15557==    by 0x4B3D2D: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B390B: ??? (in /usr/bin/python2.7)
==15557==    by 0x4A4C20: ??? (in /usr/bin/python2.7)
==15557==    by 0x4A42B2: PyImport_ImportModuleLevel (in /usr/bin/python2.7)
==15557== 
==15557== Conditional jump or move depends on uninitialised value(s)
==15557==    at 0x419658: PyObject_Free (in /usr/bin/python2.7)
==15557==    by 0x4D0C94: ??? (in /usr/bin/python2.7)
==15557==    by 0x4D086B: ??? (in /usr/bin/python2.7)
==15557==    by 0x4AA262: ??? (in /usr/bin/python2.7)
==15557==    by 0x4E0C11: ??? (in /usr/bin/python2.7)
==15557==    by 0x4AA0E3: ??? (in /usr/bin/python2.7)
==15557==    by 0x4E0BFB: ??? (in /usr/bin/python2.7)
==15557==    by 0x4AA262: ??? (in /usr/bin/python2.7)
==15557==    by 0x4E0C11: ??? (in /usr/bin/python2.7)
==15557==    by 0x4FC2C9: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBBBD: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557== 
==15557== Invalid read of size 4
==15557==    at 0x502477: PyGrammar_RemoveAccelerators (in /usr/bin/python2.7)
==15557==    by 0x4F8DF3: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Address 0x615a020 is 304 bytes inside a block of size 617 free'd
==15557==    at 0x4C2EDEB: free (vg_replace_malloc.c:530)
==15557==    by 0x4D07FA: ??? (in /usr/bin/python2.7)
==15557==    by 0x4FC2C9: _PyModule_Clear (in /usr/bin/python2.7)
==15557==    by 0x4FBADC: PyImport_Cleanup (in /usr/bin/python2.7)
==15557==    by 0x4F8D83: Py_Finalize (in /usr/bin/python2.7)
==15557==    by 0x4936F1: Py_Main (in /usr/bin/python2.7)
==15557==    by 0x507782F: (below main) (libc-start.c:291)
==15557==  Block was alloc'd at
==15557==    at 0x4C2DB8F: malloc (vg_replace_malloc.c:299)
==15557==    by 0x4A0021: PyString_FromStringAndSize (in /usr/bin/python2.7)
==15557==    by 0x4B3F50: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B407D: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B4272: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B414F: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B4272: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B3E65: PyMarshal_ReadObjectFromString (in /usr/bin/python2.7)
==15557==    by 0x4B3DE5: PyMarshal_ReadLastObjectFromFile (in /usr/bin/python2.7)
==15557==    by 0x4B3D2D: ??? (in /usr/bin/python2.7)
==15557==    by 0x4B390B: ??? (in /usr/bin/python2.7)
==15557==    by 0x4A4C20: ??? (in /usr/bin/python2.7)
==15557== 
==15557== 
==15557== HEAP SUMMARY:
==15557==     in use at exit: 3,393,686 bytes in 5,832 blocks
==15557==   total heap usage: 278,397 allocs, 272,565 frees, 371,732,938 bytes allocated
==15557== 
==15557== LEAK SUMMARY:
==15557==    definitely lost: 0 bytes in 0 blocks
==15557==    indirectly lost: 0 bytes in 0 blocks
==15557==      possibly lost: 55,704 bytes in 96 blocks
==15557==    still reachable: 3,337,982 bytes in 5,736 blocks
==15557==         suppressed: 0 bytes in 0 blocks
==15557== Rerun with --leak-check=full to see details of leaked memory
==15557== 
==15557== For counts of detected and suppressed errors, rerun with: -v
==15557== Use --track-origins=yes to see where uninitialised values come from
==15557== ERROR SUMMARY: 9917 errors from 128 contexts (suppressed: 0 from 0)

So this was the whole output I got; sorry for uploading it in parts (character limit).

I hope this gives something useful to track down the issue.

I apologise for making such a long comment chain.
I have now put the output of running the launch file through Valgrind in a gist:

gist.github.com/rishabh900/41fd6df...

And the above comment is the output after I terminated the process manually.
So what do you think now?

Did you write the launcher script, or is that third-party? It's clearly written in Python, and the issue is definitely there. I just can't narrow in on the specific issue, because the memory issues are being thrown by the interpreter (e.g. at 0x41964F: PyObject_Free (in /usr/bin/python2.7)). That indicates that something odd has been done within the Python code, but I won't be able to diagnose this further without really fully understanding the launcher's source code, and I'm afraid I don't have time to learn it.

If this is third-party code, open an issue against the launcher project, and include the above output of Valgrind.

 

Great to see a fellow low-level programmer on here!

I worked on a game engine written in C and had many issues caused by misusing the realloc function for dynamically allocated memory. My mistake was forgetting to assign the function's return value back to the pointer. It took me weeks to find the underlying problem, since it would only blow up in some cases. How would you go about debugging a situation like:

int* p = calloc(5, sizeof(int));
// some code
realloc(p, 6 * sizeof(int)); // notice no assignment

Do you use some sort of special tools? Or just some coding standards to not let this happen?

 

Whenever I'm working with memory, I pair two different tools: Valgrind and Goldilocks (PawLIB).

Valgrind is a pretty ubiquitous tool on UNIX platforms which will show me all of the memory issues encountered while running, even if the undefined behavior doesn't cause any overt problems. My code isn't done until it's Valgrind-pure. However, Valgrind only monitors the execution, so...

Goldilocks is a testing framework I developed at MousePaw Media, as a part of PawLIB. You could technically use any testing framework, but the benefit to Goldilocks is that it bakes the tests into the final executable, instead of requiring an additional framework to run the tests. That way, you can start the normal executable, run each of the tests you wrote, and see which ones Valgrind complains about.

Mind you, this does require you to write a lot of comprehensive behavioral tests...but you really should be doing that anyway in production code. ;)

 

My approach for this specific problem is to use a compiler that warns about unused return value, such as gcc or clang. I know that stdlib.h on Linux and Mac OS X already decorates realloc() with warn_unused_result attribute.

stackoverflow.com/a/2889601

But naively writing p = realloc(p, ...) is also wrong: if the allocation fails, p is set to NULL while the original block is still allocated. The original pointer is lost, and the block is now a memory leak. One option is reallocf() (a BSD extension, also available on macOS), which frees the original memory if it could not be resized.

That's a really nice feature, didn't know about it.

But wouldn't that mean data loss in case the memory can't be resized? Wouldn't that become an unrecoverable error?

@liulk Ha, I completely forgot to mention Clang! It does indeed have the best warnings of any compiler I've used. I almost always compile with -Wall -Wextra -Wpedantic -Werror; that last one (as you know, although the reader might not) causes the build to fail on any warnings.

I also use cppcheck as part of my autoreview workflow, and resolve all linter warnings before committing to the production branch.

@codevault You're right, reallocf() would just free the memory and cause data loss, so it would serve a different use case than realloc(). The more general solution would be to always use this pattern, which is more verbose:

void *q = realloc(p, new_size);
if (q == NULL) {
  // do error handling.
  return;
}
p = q;

I just find that in most of my use cases, I would end up freeing p in the error handling, so I would just use reallocf() which results in less verbose code.

I see, that makes sense. I can see myself freeing the memory most of the time when reallocation fails.

Good to note. Thanks!

 

I should add, I use another tool from PawLIB called IOChannel - basically, a std::cout wrapper - that allows me to cleanly print the address and raw memory from literally any pointer, without having to use a debugger. This can make debugging some problems infinitely easier, especially when you're contending with a Heisenbug that goes away if compiled with -g, but appears when compiled with -O2.

Thanks for the response!

Unfortunately, I didn't find a version of Valgrind for Windows. I tried DrMemory but, after lots of struggle, it didn't give me any helpful information and dropped the ball. Do you have experience with low-level development on Windows, or do you work exclusively on Linux since it is more convenient?

I rarely use Windows for development, as its development toolchain is almost invariably miles behind its UNIX-based counterparts.

If you're on Windows 10, I strongly recommend setting up the Windows Subsystem for Linux [WSL]. That will give you access to the Linux development environment for compiling and testing. Then, use the LLVM Clang compiler on both the WSL and the Visual Studio environments. That way, once you know it compiles and runs Valgrind-pure on WSL, you can trust that it will work on VS Clang.

 

Are rust and golang going to take over C and C++? In terms of desktop software/web development. Also how would you, as a C++ expert, rate these languages? Do they have potential?

 

I strongly believe that (virtually) all languages have their place. FORTRAN and COBOL have firmly established places in the world, and are almost certain never to lose them on account of their reliability and precedence.

C and C++ likewise have this precedence, making up a sizable chunk of our source code. It's the old "if it ain't broke, don't fix it" concept; I doubt the entire collection of software that makes up a standard Linux-based operating system will ever be rewritten from C to Rust, because most of what already exists works quite well.

That said, I think Rust and golang have a lot of potential as languages, especially Rust.

(In my personal opinion, golang is a rather hipster language, but that's based in my feelings towards it, not in anything practical; so take that with a grain of salt.)

Rust looks especially interesting in the area of error handling. I'll admit, I haven't had the time to learn it very well yet, but it's DEFINITELY high on my list!

In other words, Rust and golang will probably find established places in the programming world, but they won't be displacing C, C++, or any other established language. Every tool has its place, and a quarter inch drill bit doesn't replace a 5mm drill bit.

 

"In my personal opinion, golang is a rather hipster language"

Thank. YOU!

 

Rust would take over everything, but the leftist collectivist community threw a baton roue of so-called "golang" at it, and now we will all perish and return to a dark age of literal witch hunts. (Because we are all out of Moore's law)
RIP information technology and human civilization in general

 
 

Hey, I wrote a program for TSP ant colony optimization, and I get a segmentation fault when I compile and run the code, but when I debug it through gdb the program runs flawlessly. What am I missing?

 

You are dealing with what is called a Heisenbug, which is a bug, usually undefined behavior, whose behavior disappears when using debugging tools.

The first thing you should do is run the program through Valgrind (valgrind ./myprogram). Ideally, you should do this on the Debug version of your program (compiler flag -g). This may provide you information on what memory errors exist in your code, and where they are in the source. Fix everything Valgrind complains about.

However, if after doing that, you're still segfaulting, and even Valgrind can't pick up on any more errors, you're in for a bigger fight.

Start by reading my popular Stack Overflow Q&A Definitive List of Common Reasons for Segmentation Faults. This will attune your programming sense to what to look out for.

(I didn't include my personal favorite in that list: lambdas returning references can cause some particularly nasty undefined behavior.)

  1. If you have an idea of when the segmentation fault occurs functionally, that can help you figure out what function(s) may be involved. If you can, try to create a Minimum Reproducible Example that has the segfault.

  2. Print off the problem area of the code on paper. Desk check it with a red pen and a pad of paper. This means you act as the compiler, running the code mentally, and noting the value of each variable. I've caught a number of bugs this way.

  3. If you're desperate, you can run the Release target of the program through Valgrind, although this will give you raw memory addresses instead of line numbers and file names. If you're very clever with a debugger that has a disassembly view, like Nemiver, and know how to read assembly code, you may be able to work backwards to isolate the problem. However, this is extremely hard; it will help a lot if you can do this with your Minimum Reproducible Example instead of the full program.

Good luck!

 

I think I kinda located the problem, but I can't understand why this is happening. As you can see in the image above, the variable i for some reason decides to be whatever value it wants despite the fact that it is in a for loop.

for some reason decides to be whatever value it wants despite the fact that it is in a for loop.

This means it is reading from uninitialized memory. Common reasons for this:

  • You declared a variable, or dynamically allocated memory, but never initialized the memory with a value.

  • You are using a pointer (or reference) to either a position in memory which has already been freed (dangling pointer/reference), or which has never been allocated (wild pointer/reference). This can happen with either the heap or the stack; it's not limited to dynamic allocation.

  • You are exceeding the boundaries of an array or string (buffer overrun).

 

Have you ever worked with the Motorola 68000? I really like that CPU. In your opinion, is assembly language still best for super low-level hardware, or do you think C is on par with assembly code?

 

Ironically, I just added 68K Assembly to my list of languages to learn soon! I have a TI-89 calculator (Motorola 68000), and dearly want to play with it.

Up to this point, my assembly work has been largely limited to the x86 and x86-64 instruction sets, in the context of Intel and AMD processors.

C is actually further up the stack than people think, and it isn't always the best choice for a given architecture. If you need total control, Assembly will always give that to you far and beyond any other language.

However, Assembly is also a pain in the butt (if an endearing one to certain classifications of nerds such as myself). If you have access to a higher level language that is reasonably optimized for that platform, and you don't need ultimate control, use it instead of Assembly.

In other words, "just because we can doesn't mean we should." If you can't make a reasoned argument for the language you're using, you're probably using the wrong language. :)

 

Thank you for the reply, I always value getting a second opinion. The reason I'm asking this question is because I am building a game on the Sega Genesis and I've been using a C compiler to do it.

So far it hasn't been an issue because the C compiler was built for the Sega Genesis and it has a lot of nifty features to take advantage of the hardware features such as DMA. More importantly it has sound drivers which are incredibly useful because I do not want to go around writing my own Sound Driver because I am not experienced with writing such a program.

I have recently run into a few shortcomings with the compiler. First and foremost, the routines I've written in C don't seem to draw to the screen as fast as the equivalent Assembly.

I think I will compromise by writing my screen drawing routines in Assembly and then including them in my C code. I think that would be best for me because then I would have access to features in the C compiler as well as having access to the speed of Assembly. The problem is that I am not experienced with Assembly code. Fortunately for me, 68k assembly seems to be the easiest Assembly to learn.

By the way the C compiler I'm using is called SGDK (Sega Genesis Development Kit)

What do you think about mixing languages, is it something to be avoided?

It really depends on the languages!

There's no trouble combining C and Assembly; ultimately, C is compiled down to Assembly, at which point any Assembly code you wrote outright is just inserted in. Then, the whole thing is assembled down to binary on that particular platform.

However, you can run into varying degrees of performance issues when mixing other languages. It has to be taken on a case-by-case basis.

Bravo on making a game for Sega Genesis! Keep us posted on dev.to how that goes.

I highly recommend picking up "Game Engine Architecture" by Jason Gregory. It addresses many of the issues you're facing, and hundreds more besides, from a C and C++ perspective. He even talks about console development.

 

Ooh, this is an interesting question and reply! I've been increasingly curious about the 68k since I learned of its inclusion in the CPS1, CPS2, and 90's Macintosh systems!

I'm really curious as to how they developed games for the two former and how development was done in the latter, which I believe was mostly Pascal.

 
[deleted]
 

I'm not as well acquainted with assembly as I would like yet, but here's my first thought: are you absolutely certain of the size (in bytes) of memory addresses on your machine? This looks like you're blowing past the end of program memory.

 

Yes, I'm sure. However, I made some progress:

In the original 32-bit code, I had an instruction like:

        mov $cold_start,%esi    // Initialise interpreter.      

that would work on macOS (32-bit).

I found a 64-bit port somewhere that was instead using

    mov $cold_start,%rsi    // Initialise interpreter.

which is what I expected, but the Apple Clang assembler does not like this syntax, because in 64-bit mode I have to use position-independent addressing modes.

So I tried

mov cold_start(%rip),%rsi       // Initialise interpreter.  

but it seems that this dereferences cold_start instead of just putting the address in. Using $cold_start(%rip) gives errors.

I guess I just don't understand the Apple assembler syntax, especially for 64-bit code. Looking ...

 

Given that you're writing a fairly low-level, perhaps wait-free concurrent algorithm, it's possible that you have a bug that trips up about every millionth execution of the code. It's a race condition that corrupts a vital structure. Any attempt to trace upsets the condition leading to the error, thus making it go away. Your only choice is a tedious hand execution and logical reasoning.

My question is, how do you avoid throwing the computer out the window?

 

Avoid? I love tedious hand execution and logical reasoning! (No, seriously.) One of my absolute favorite things to do in programming is to print off the source, sit down with a pen, a hot beverage, a blank notebook, and a jazz soundtrack...and then spend the next hour or three just desk-checking the entire thing.

Mmmmmmmmmmmmmmmmmmmmmm, bliss. ,^

Why do you think I specialized in memory management and undefined behavior? I ADORE it!

Now, if you don't have my particular mental condition, and actually don't enjoy desk-checking for Heisenbugs, my advice is this: get off the computer. Print off the source, cozy up in your favorite chair in a relaxing environment, and desk-check it.

 

I also recommend writing unit tests that make the race condition more likely to happen. For example, if the code normally runs with < 10 threads, test it with 1000 threads. Sometimes code is well-behaved when the data entered are far apart, so try testing with consecutive values. If it's the opposite, test with random values.

What I learned over the years is that race-freedom is not composable: code using several mutexes incorrectly could still suffer a race condition, even though each single mutex is race-free on its own. When testing wait-free algorithms, start with very small primitives and gradually build on them. And write plenty of assert() checks on the non-volatile local copies of the shared volatile variables the code uses. When an assert triggers under the debugger, you'll be able to see which invariants are violated in that snapshot.

 

Hi Jason, pleased to meet you.
I'd like to ask you a question about C memory management, if you'd be so kind. In short, it's about a program that lets you input or randomly generate a set of 5 numbers between 1 and 50, and 2 numbers between 1 and 12 (EuroMillions lottery). Then it keeps drawing over and over (more than 139 million times on average) until it gets the same set of numbers. The issue is that, despite using dynamic allocation and deallocation for the structures, the running program eventually exhausts the RAM until the process gets killed by the system to avoid a stall.
I've tried a lighter version of the code (just 5 numbers from 1 to 48, 1.7 million loops on average) with no problems, and made sure the code is actually recycling the memory used with each pass. Any ideas of what could be wrong with it?
Thanks in advance, regards.

 

I'd really need to see some code to be able to debug this, but here's the first two things I'd look for:

  • Double check that things are actually deallocating; you'd be amazed at how often one thinks they have free'd memory when they haven't. You may be able to run it through Valgrind or another dynamic analyzer to check that. (It sounds like you've done that, though.)

  • You may have some other variable you didn't think about, either on the stack or on the heap.

If it wouldn't be too much trouble, can you put the code in a GitHub Gist or another paste bin? I might be able to catch the problem better if I read it.

 

Thank you so much for your quick and kind response Jason! I will paste the code to GitHub and share here a.s.a.p, I'm a newbie to it, as well as to many other things. Will take your advice and check out what you suggested.
We'll keep in touch.
Regards!

 

I know this question is going to come so I'm going to ask it myself: what do you think of languages like Rust?

Don't you think that, in some cases, a machine is going to be better than a human at dealing with memory management anyway?

 

See the other question in this thread re: Rust. Long story short, it's a cool language that I haven't yet had time to learn.

In terms of man vs machine, the answer is that "computers are inherently stupid." When we are trusting the machine to manage the memory, what we're really doing is trusting someone else's code to manage the memory. In either case, some human is responsible for the memory handling logic. Therefore, it really depends on the code you're trusting!

The benefit to trusting the language's built-in memory management is that the code is almost certainly more rigorously reviewed and tested. That's where the apparent added trustworthiness comes from.

A lot of times, I will trust automatic memory management tools over my own abilities. std::unique_ptr and std::shared_ptr, for example, are excellent tools that help minimize memory mistakes (because, after all, I'm only human). However, there are times that the logic I need would become too convoluted with those magic pointer classes, so I'll resort to manual management.

It's basically a balancing act between simplicity (the more complicated the code, the more chance for bugs) and safety (reducing the chances of a memory leak). If you write really complicated code to use "memory safe" tools, you can still wind up making a royal hash of it, when a simple pointer would have meant 80% less logic, and thus prevented those issues.

 

Thanks for the detailed explanation!

I wonder if in the future there will be attempts to introduce AIs into managed memory systems, to increase what you call "apparent added trustworthiness".

 

Hello, Jason

I work on an in-house code that simulates fluid dynamics. It currently has four hundred thousand lines. An important piece of the code uses the PETSc library to solve linear systems. The code uses MPI for parallel communication and is written in C and Fortran, but most of it is in Fortran 90. After moving to a new version of GCC (>8.0.0), the code started presenting a memory leak when the PETSc library is active. I tried to use Valgrind and DrMemory to find that leak, but I still could not find the problem. I'm quite sure that the problem is not in the PETSc library, but in the way that I communicate the global matrix. I noticed that you have expertise in memory leaks, and perhaps you can look at the Valgrind log and help me detect where the memory leak is happening.

The Valgrind log contains a lot of information, but all of it points to the MPI library or the HDF5 library; none of it points to the f90 files or the PETSc library. In that situation, is it possible to locate the memory leak?

This is how I run it with Valgrind:
mpirun -n 2 valgrind -v --leak-check=full --show-leak-kinds=all --log-file=valgrind%p.log --track-origins=yes ./amr3d

Thanks in advance
Millena

 

Hi Millena,

Of course, without seeing the Valgrind log itself, it's hard to say. (If you do post that, please use a GitHub Gist, a Hastebin, or something of the sort.)


First, some technical background. Apologies if you already know some/all of this. It's also for other readers:

Memory leaks can be an issue, but they aren't necessarily a sign that something is wrong. In some particularly complex programs or libraries, it is impractical to manually free all of the allocated memory at the time of program termination, so it is acceptable to just allow that memory to be freed as a part of the entire program's stack and heap being released. So, in that sense, it is perfectly possible that a memory safe program can report memory leaks.

However, as you know, a memory leak can become an issue if it is occurring during the program's lifetime, instead of just before termination.

In any case, a memory leak always has the same cause: memory is being dynamically allocated, but not freed before the last pointer to it goes out of scope. Thus, it becomes virtually impossible to free said allocated memory, so it can never be reused during the life of the program. If this happens enough times, you can actually run out of heap space.

One more issue that makes this difficult is that a memory leak can occur in one place, but be caused by usage elsewhere. For example, you might call a library function that allocates memory, but not realize you must call another library function to deallocate it. That's far more likely to occur in C than in C++ or Fortran, due to C's lack of objects and their constructors/destructors for handling automatic allocation and deallocation related to the lifetime of an object. That's why the entire stack trace is so important.
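
To make the "allocate in one call, free in another" pitfall concrete, here is a minimal C++ sketch. The library functions `lib_create` and `lib_destroy` are hypothetical stand-ins invented for illustration, not part of PETSc, MPI, or any real API. The idea is to tie the pair together with a `std::unique_ptr` custom deleter so the matching destroy call cannot be forgotten.

```cpp
#include <cassert>
#include <memory>

// Hypothetical C-style library API (names invented for this example).
struct LibHandle { int id; };
static int g_live_handles = 0;   // handles created but not yet destroyed

LibHandle* lib_create(int id) {
    ++g_live_handles;
    return new LibHandle{id};
}

void lib_destroy(LibHandle* h) {
    if (h != nullptr) {
        assert(g_live_handles > 0);
        --g_live_handles;
        delete h;
    }
}

// Deleter that pairs every lib_create with exactly one lib_destroy.
struct LibHandleDeleter {
    void operator()(LibHandle* h) const { lib_destroy(h); }
};
using LibHandlePtr = std::unique_ptr<LibHandle, LibHandleDeleter>;

// Wraps the raw handle as soon as it is created.
LibHandlePtr make_handle(int id) {
    return LibHandlePtr(lib_create(id));
}

int live_handles() { return g_live_handles; }
```

When a `LibHandlePtr` goes out of scope, `lib_destroy` runs automatically, so `live_handles()` drops back to zero without an explicit cleanup call at the call site.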


Here are my initial thoughts on actually tracking this down.

First, I find it interesting that the problem only occurs after you start using a new version of GCC. This does not necessarily mean there's a bug in the compiler, however. Memory-related bugs have a tendency to develop strange properties, such as Heisenbugs or Schrödinbugs. There are many more such freaks of nature besides.

I wonder if the conditions for a memory leak have been present in either your project's source or the library source for some time, but particular implementation details (or another bug?) in previous versions of GCC concealed its existence. Once the behavior of GCC or the standard library was changed in the latest version, possibly fixing a bug, the memory leak was no longer being coincidentally diked out. In fact, I'd wager this is the most likely possibility.

I'd also suspect that the memory leak itself actually is in the Petsc library, given that it must be involved for the leak to occur. It may even be possible that the latent bug existed in Petsc. However, the cause of the memory leak quite possibly originates from your source; your particular usage of the library may be triggering some sort of peculiar corner case in Petsc, wherein the bug resides.

The other possibility is that GCC 8 has a bug itself, but given the size and domain of your program and its libraries, that would be quite difficult to isolate.


And now the bad news: tracking this down is probably non-trivial. If you compile both the library and the source with -g, Valgrind should give you line numbers you can use to check the source. You'll need to work backwards to figure out what's wrong.

That would be the easy solution, and hopefully it's as far as you need to go.

I would also recommend testing your code against the LLVM Clang compiler, if possible. Does the same error occur there? If it does, be glad! You need only pick apart your code and that of Petsc to find the problem.

Otherwise, if GCC 8 does prove to be a necessary environmental factor for the memory leak to occur, and you cannot isolate the problem any other way, another rather involved thing you can do is to perform a bisect on compiler versions.

  1. Spin up a clean environment, such as in a virtual machine.

  2. Build your libraries and source with the last compiler version you remember working. Ensure the memory leak is not present.

  3. Build with the compiler version you know isn't working. Verify that the memory leak is present. (If it isn't, you can rule out the compiler; you're now probably dealing with a phase-of-the-moon bug, which will require you to figure out what on your development machine is causing the issue.)

  4. Assuming 2 and 3 have the expected outcomes, use those two compiler versions as your endpoints for a bisect. Check compiler versions in between, following the same workflow as a git bisect, until you know the first version that presents a problem.

  5. If you have the time and access, consider bisecting on the development versions leading up to the first version of the compiler that presents the issue.

  6. Check the changelogs for the version. If you did step 5, look through the commit messages. Try to isolate the change to the compiler that is contributing to the memory leak.

If you do all this, remember, this might not be an actual bug in the compiler. If you can determine what caused the behavior to change in the library, you may be able to find the latent bug in Petsc.
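
The bisect in step 4 is a binary search over an ordered list of compiler versions. A minimal C++ sketch follows, assuming the versions are listed in release order and that once the leak appears it stays in every later version. The function name `first_bad_version` and the `leaks` callback are hypothetical; in practice the callback would stand for "build with this compiler, run under Valgrind, report whether the leak shows up".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the index of the first version for which `leaks` returns true,
// or versions.size() if no version leaks.
// Assumption: all good versions come before all bad ones.
std::size_t first_bad_version(
    const std::vector<std::string>& versions,
    const std::function<bool(const std::string&)>& leaks) {
    assert(!versions.empty());
    // partition_point does the halving: it finds the first element for
    // which "does not leak" stops being true.
    auto it = std::partition_point(
        versions.begin(), versions.end(),
        [&](const std::string& v) { return !leaks(v); });
    return static_cast<std::size_t>(it - versions.begin());
}
```

Because each check means a full rebuild, the point of the halving is that roughly log2(N) builds are needed rather than N.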


I hope that helps!

 

Hello, Jason
First of all, thanks a lot for your opinion and advice! I will try to perform, step by step, what you suggested above. Below is a link to the Valgrind log. As the tests are performed, I would like to share the results with you. Can I contact you by email?

gist.github.com/mmvillar/ca0a726a4...

again, Thank you!

Sure, my contact info is on my personal website. Link is on my DEV profile. I can't guarantee that I'll be able to solve this remotely, as you know the code far better than I could hope to, but I'll help where I can.

EDIT: Looked at the log. Yeah, you'll definitely need to compile your dependencies with -g, in order to have all the information you need.

 

Oh what a small world. I too am an expert in segfaults. Nary a day goes by without my code segfaulting...

Sorry, obvious joke that I didn't see anyone else make. Thank you for doing this, it's very insightful.

 

That's how you become an expert at solving them. ;)

 

Hi there, I am pleasantly surprised to discover this site.

Some people using my code have had segmentation faults. I'm looking for a way to generate console output that these people can copy and send me, so I can track down the issue.

When a SEGFAULT happens, it's game over. I'd like to capture this, inspect some variables and cout them before exiting, something like try-catch. But I believe cout after a SEGFAULT leads to undefined behaviour, so...

Any suggestion? Thank you.

 

Unfortunately, it is not possible to "catch" a segfault the way you would catch an exception, nor to continue program behavior safely (if at all) after it has been raised. Therefore, you have to take the opposite approach, and log everything that happened leading up to the segfault.

You can also have your tester describe (or screen record, especially if it's a game) what happened leading up to the segmentation fault. Then, you should be able to replicate that on your own machine.

Mind you, "replication" won't necessarily mean you can recreate the segmentation fault itself, since it's one of an infinite number of possible behaviors in response to some illegal memory action your code is taking (ergo "it is legal for the compiler to make demons fly out of your nose"). That's what is meant by undefined behavior. However, by replicating the same steps as your tester while running the Debug build of the application (compiled with -g) under Valgrind, you should be able to catch the problem.
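
For the "log everything leading up to the segfault" approach, one detail matters: output still sitting in a buffer when the process dies is lost. A minimal sketch, assuming a hypothetical helper named `trace` that flushes after every line:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Writes one trace line and flushes immediately, so the line has left the
// stream's buffer before the next (possibly crashing) statement runs.
// `where` and `what` are free-form labels chosen by the caller.
void trace(std::ostream& out, const std::string& where, const std::string& what) {
    out << "[trace] " << where << ": " << what << '\n';
    out.flush();
}
```

In real use the stream would be `std::cerr` (which flushes after each write by default) or a `std::ofstream` log file that testers can send back; the last line in the log then marks the last point known to have been reached.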

There is also a more proactive approach you can take, especially if you're using C++: modernize your code base. Refactor the code - by hand mind you, NOT by using find-and-replace or some other automated tool - to make use of smart pointers like std::unique_ptr and std::shared_ptr instead of raw pointers, new, and delete. This will eliminate most memory errors, since the smart pointers handle object lifetime and whatnot (formally known as RAII). Refactoring is not a "quick fix", but it's the most resilient fix.
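
A small before-and-after sketch of that kind of refactor. The `Widget` type and both functions are hypothetical; the `live` counter exists only so the effect can be observed.

```cpp
#include <cassert>
#include <memory>

struct Widget {
    static int live;   // number of Widget objects that currently exist
    int value;
    explicit Widget(int v) : value(v) { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Before: raw new/delete. The early return skips the delete and leaks.
int process_raw(int input) {
    Widget* w = new Widget(input);
    if (w->value < 0) {
        return -1;           // bug: w is never deleted on this path
    }
    int result = w->value * 2;
    delete w;
    return result;
}

// After: std::unique_ptr destroys the Widget on every path out.
int process_owned(int input) {
    auto w = std::make_unique<Widget>(input);
    if (w->value < 0) {
        return -1;           // w's destructor runs here
    }
    return w->value * 2;     // and here
}
```

Calling `process_raw(-1)` leaves `Widget::live` at 1, while `process_owned(-1)` leaves it at 0. The logic is otherwise identical, which is why this refactor tends to be mechanical but still deserves a human eye on each ownership decision.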

 

Thank you very much. You confirmed my approach is right: cout everything!

In my specific case, my code is appended to third-party code where the segfault happens, so it's hard to trace and I don't even have the chance to fix it.

At the risk of self-promotion, I wrote something called IOChannel which is designed to better control cout-style logging, based on category and priority. You can also route messages to different places, including to functions that will write them out to a file instead of printing them to the console. It's part of PawLIB, which is still in development, but 1.0 is stable. (Yes, totally open source)

 

About garbage collectors (not mentioned so far!): I think that they are nice, but they should be optional rather than the default management. Once a GC handles the allocations, it's very hard to use another management scheme on top, because GCs tend to free manually allocated resources when they think those are no longer used, i.e. not reachable from a root memory block.

What's your favorite management technique: manual, ref counting, or GC?
Do you share my point of view?

 

Since I use C++ primarily, and it doesn't have a built-in garbage collector by default, I've just formed the habits of handling everything myself. Those habits and instincts carry over to other languages that do have GCs, but I will still manually free things as far as I'm allowed.

In a broader sense, I don't generally trust generic abstractions to do my work for me. If I'm not sure what's needed, I'll leave it to the automatic systems, but try to understand what's happening under the hood. If I know for certain what needs to happen, I'll do it myself, and let the automatic systems do mop-up work behind me in case I miss anything.

In the same way, I never let the compiler define constructors or destructors for me. Every (non-static) class I write has, at the least, explicitly empty constructors. Knowing how my coding adventures usually go, the one time I trust the compiler to define the destructor, it'd hit an edge-case and bork. So, I don't leave much room for that kind of madness.

Ironically, the above is probably in part my Python background talking: "explicit is better than implicit".

Now, with that said, one should know all the automatic tools their language offers, and how to use (and not to use) them. Doing things manually is not an excuse for ignorance. All the above does not preclude me from using such bits of magic as std::unique_ptr, which handles its own deallocation automatically when it goes out of scope (no GC involved). I simply make an informed decision on whether to do it myself, or to use a tool that specifically matches the use case.


By the way, in terms of ref counting, I am reminded of a classic AI Koan...

One day a student came to Moon and said: “I understand how to make a better garbage collector. We must keep a reference count of the pointers to each cons.”

Moon patiently told the student the following story:

“One day a student came to Moon and said: ‘I understand how to make a better garbage collector...
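
The koan is usually read as a reminder that plain reference counting cannot reclaim cycles: a structure that refers back to itself never reaches a count of zero. A minimal C++ sketch with `std::shared_ptr`, using hypothetical node types and a counter added only to observe the effect:

```cpp
#include <cassert>
#include <memory>

static int g_live_nodes = 0;   // nodes constructed but not yet destroyed

struct StrongNode {
    std::shared_ptr<StrongNode> other;   // owning link
    StrongNode() { ++g_live_nodes; }
    ~StrongNode() { --g_live_nodes; }
};

struct WeakNode {
    std::weak_ptr<WeakNode> other;       // non-owning link
    WeakNode() { ++g_live_nodes; }
    ~WeakNode() { --g_live_nodes; }
};

// Two nodes that own each other: neither count ever reaches zero.
// This function leaks on purpose to show the problem.
int nodes_left_after_strong_cycle() {
    int before = g_live_nodes;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;
    }
    return g_live_nodes - before;
}

// Same shape, but the links do not own, so both nodes are destroyed.
int nodes_left_after_weak_cycle() {
    int before = g_live_nodes;
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->other = b;
        b->other = a;
    }
    return g_live_nodes - before;
}
```

The first function reports two surviving nodes and the second reports none, which is the usual argument for breaking ownership cycles with `std::weak_ptr`.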

 

Hello, I hope you can help me. It's a school project. I get a segfault when I run it and type a big string as input. The thing is, I have a buffer that is big enough, and when I run it with Valgrind I get no segfaults. Do you have any idea?

 

I could probably help, but I'd need to see your code to do so. Can you create a GitHub Gist?

 

Hello, Jason!

I'm trying to recreate the ls command, but I get a segmentation fault when printing recursively, as with ls -R /. I'm checking whether I have permission to open each folder, and that's working pretty nicely. Each time, it reaches a different depth of the file tree. Could you please give me a piece of advice on how to handle it, and on what can cause the segmentation fault? Thanks in advance!

 

The first rule of debugging: "If it's weird, it's memory."

You have to remember, a segmentation fault is just one possible outcome of undefined behavior. You can use a dynamic analyzer, such as Valgrind, to dial in on the exact part of your code with the problem.

Look especially for the following "hot spots" for undefined behavior:

  • Dereferencing pointers.
  • Dynamic allocation.
  • Passing by or returning a reference.
  • Traversing array-like structures, such as C-strings.

A while back, I wrote up a big list of common reasons for segmentation faults. It might be helpful.
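
As one illustration of the C-string hot spot: a recursive directory lister typically builds "parent/child" paths, and doing that into a fixed-size buffer without a length check is a classic overflow. Whether that is the cause here is only a guess without the code. A minimal sketch of a bounded join, with the hypothetical name `join_path`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Builds "dir/name" into `out`, whose capacity is `out_size` bytes
// including the terminating '\0'.
// Returns false if the result would not fit; snprintf never writes past
// out_size, so a too-long path is truncated rather than overflowing.
bool join_path(char* out, std::size_t out_size, const char* dir, const char* name) {
    assert(out != nullptr && dir != nullptr && name != nullptr);
    if (out_size == 0) {
        return false;
    }
    int n = std::snprintf(out, out_size, "%s/%s", dir, name);
    // n is the length the full string needs, excluding the terminator.
    return n >= 0 && static_cast<std::size_t>(n) < out_size;
}
```

The caller should skip (or report) any entry for which `join_path` returns false, rather than recursing with a truncated path.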

 

C is my favorite language, but I have never worked with it professionally. I've completed "Learn C the Hard Way".

What's another challenging text for a professional developer who is a C dilettante?

 

I've got a few on my shelf I enjoy in that category:

  • Game Programming Patterns by Robert Nystrom discusses many patterns, including a number related to dynamic allocation and memory management, from a game development perspective. (Written mainly for C++, although you could take on the challenge of implementing the patterns in C!) Besides that, his comical, bantering style makes for a really fun read.

  • Hacker's Delight by Henry S. Warren contains a number of mind-bending algorithms that operate in C and Assembly.

  • Game Engine Architecture by Jason Gregory explores the myriad of challenges that game engine developers face, especially issues of performance and memory management. Again, this is written primarily for C++, but you can approach many of the problems from a C perspective as well.

  • The Art of Computer Programming by Donald Knuth. Okay, I don't own this one, but I really really want a copy! It's quite a challenge to wrap your head around his algorithms and patterns, many of which are fundamental to the field of computer science.

 

what’s the simplest way to cause memory leak ?

 

Fail to free memory after you allocate it, and then destroy the pointer.
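
In C++ that can be as short as the sketch below. The `Tracked` type and its `live` counter are hypothetical and exist only so the leak is observable.

```cpp
#include <cassert>

struct Tracked {
    static int live;   // objects constructed but not yet destroyed
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

void leak_once() {
    Tracked* p = new Tracked;  // allocate on the heap
    assert(p != nullptr);      // plain new throws instead of returning null
    p = nullptr;               // the only pointer to the object is overwritten
}   // no delete was ever called, so the object can never be freed
```

After one call, `Tracked::live` is 1 and stays there for the rest of the program.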

My return question is, why would you want to? ;-)

 

Ironically, you of all people should have a very easy answer to your own question -- 'Why would you want to?' -- To learn more about what went wrong? To study it, to understand it, and to prevent it from happening again in the future. Sometimes, the best way of learning is by knowingly doing something 'silly' -- I wouldn't call it stupid because you are doing it with the expectation, which means you are preparing for it. All good engineers try to reach this state -- be aware of what can go wrong, and how to handle it.

Yes, that's fair, in learning. But, honestly, 99% of my learning comes from just trying to do hard things, and working with the failures as they come. Those are far more practical and effective to learn from than any sort of deliberately manufactured mistake.

 

What is your preferred method to return from a segfault?

i.e. define

do_segfault()
 

If I understand your question right...

A segfault is the best possible behavior that you can get given undefined behavior, because it's a specific runtime error that you can probe. You're actually not guaranteed to get a segfault when your code has undefined behavior.

Thus, it is both technically impossible and entirely unwise to "recover" from a segfault. Let the program crash (did we have a choice?), figure out what in your code is undefined behavior, and fix it.

To put that another way, because a segmentation fault is a runtime error, and one that isn't guaranteed anyhow, it's immune to try-catch statements and error handling.

 
 

How to create segmentation fault using malloc?

 

Segmentation faults are just a possible outcome of undefined behavior, which is unpredictable by nature. They are not like exceptions! There is no official or reliable way to "throw" a segmentation fault, as it only occurs with undefined behavior, wherein "it is legal for the compiler to make demons fly out of your nose". No matter how you "cause" a segmentation fault, there's always a chance it will do something else, or even somehow appear to produce valid code instead!

Meanwhile, malloc doesn't tend to have undefined behavior, unless you're dealing with heap corruption (a very bad thing), wherein you somehow wrote into unallocated memory. If that's the case, your problem isn't the malloc call, but rather something elsewhere. Barring problems from elsewhere in the code, malloc will only do one of two things: return a pointer to allocated space, or return a null pointer because it couldn't allocate. Simple as that. :)
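
A short sketch of handling both outcomes of malloc. The helper name `duplicate_string` is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Returns a heap-allocated copy of `src`, or nullptr if malloc could not
// allocate. The caller owns the result and must release it with std::free.
char* duplicate_string(const char* src) {
    assert(src != nullptr);
    std::size_t len = std::strlen(src) + 1;   // +1 for the terminating '\0'
    char* copy = static_cast<char*>(std::malloc(len));
    if (copy == nullptr) {
        return nullptr;                       // outcome 2: allocation failed
    }
    std::memcpy(copy, src, len);              // outcome 1: usable memory
    return copy;
}
```

The undefined behavior only enters if the caller skips the null check, writes past `len` bytes, or uses the pointer after freeing it.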

 

Any advice for complete coding beginners, please? I would appreciate it.

 

Sure.

Don't mess with manual memory management...yet.

It's very, very easy to proverbially blow a limb off with manual memory management. Get skilled with the fundamentals of programming first, and establish habits that allow you to write clean, stable code in a higher level language.

Once you can write a few hundred lines of code in, say, Python or Java, and have them work right on the first or second attempt, then dive into more advanced concepts. Manual memory management is something that's easy to get wrong, so you need to first have an established track record with yourself of getting things right.

Before I ever touched C++ and memory management, I had written a couple of reasonably stable, small applications, and had actually implemented a programming language in ActionScript 3.0 with regular expressions. (I don't recommend the latter; it was a great challenge, and it worked great for the purpose it was designed, but it pretty well sucked in terms of performance.)

With all that under my belt, I was able to start using C++. Even then, I avoided manual allocation whenever possible, using memory-safe tools and methods first. Once I was experienced with those, I started doing more and more manual allocation and raw pointer arithmetic. I made a lot of mistakes at the start, but that's how we learn best!

 

What are some of the common issues you find when working in memory management? Do you have one way of working through them or multiple?

 

I do go through a little list in my mind:

  • Do my allocations and frees match? (malloc and its cousins with free, new with delete, new[] with delete[]).
  • Do I null out my pointers immediately after freeing?
  • Do I check if a pointer is null before using it?
  • Does my pointer math have any edge cases?
  • Are my iteration loops all safe? (I always get uneasy around while loops that touch allocated memory.)
  • Do my recursive functions have explicit stop conditions?
  • Are my C-strings (if any) null terminated?
  • Are my destructors properly freeing allocated memory?
  • Do I have tests for all major functionalities?
  • Are all my tests running Valgrind-pure?
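
Three of the checklist items (matching new[] with delete[], nulling out after freeing, and checking for null before use) can be sketched together. The `IntBuffer` type and its helper functions are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>

struct IntBuffer {
    int* data = nullptr;
    std::size_t size = 0;
};

void release(IntBuffer& b) {
    delete[] b.data;     // new[] is matched with delete[]; deleting nullptr is a no-op
    b.data = nullptr;    // null out immediately so stale use is detectable
    b.size = 0;
}

void allocate(IntBuffer& b, std::size_t n) {
    release(b);              // avoid leaking a previous allocation
    b.data = new int[n]();   // value-initialized: every element starts at 0
    b.size = n;
}

// Reads element i into `out`. Returns false instead of touching memory
// when the buffer is empty or the index is out of range.
bool read_at(const IntBuffer& b, std::size_t i, int& out) {
    if (b.data == nullptr || i >= b.size) {
        return false;
    }
    out = b.data[i];
    return true;
}
```

Because `release` nulls the pointer, calling it twice is harmless, and a read after release fails cleanly instead of dereferencing freed memory.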
 

Can I or can I not return and use an address that does not point to the start of a dynamically allocated number of bytes but somewhere inside the allocated memory?

 

Yes. As long as the pointer points to a sector of memory which is within an allocated region of the program's heap memory, it is valid, and can be accessed and used.

However, be aware that there can be unpredictable problems if you read from uninitialized memory — memory which has been allocated, but no value assigned to it.
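
A minimal sketch of handing out an interior pointer. One caveat worth stating: the address passed to free must be the one the allocator originally returned, not the interior one, so the original has to be kept somewhere. The `Slice` type and both functions are hypothetical names; calloc is used so the bytes start out zeroed.

```cpp
#include <cassert>
#include <cstdlib>

// Keeps the original allocation address next to the interior pointer.
struct Slice {
    char* base;    // what calloc returned; the only address valid for free
    char* inner;   // points somewhere inside the same block
};

// Allocates `total` zeroed bytes and returns a pointer `offset` bytes in.
// On allocation failure both members are nullptr.
Slice make_slice(std::size_t total, std::size_t offset) {
    assert(offset < total);
    char* base = static_cast<char*>(std::calloc(total, 1));
    if (base == nullptr) {
        return Slice{nullptr, nullptr};
    }
    return Slice{base, base + offset};
}

void free_slice(Slice& s) {
    std::free(s.base);        // free the original address, never s.inner
    s.base = nullptr;
    s.inner = nullptr;
}
```

Reads and writes through `inner` are fine as long as they stay within the `total - offset` bytes that remain in the block.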
