theohartsook

Things I learned to appreciate in graduate school: abstraction

I finished my master's degree over the summer and started a PhD program right after. I have always felt like a bit of an odd one out because my background is in environmental science and GIS rather than computer science or math. However, programming was a big part of my master's and will continue to be in my PhD. I also dabbled in it during my career before going back to school. I would like to share some of what I learned over that time, because some concepts I struggled to learn in class only began to make sense once I started writing semi-serious code.

I never really got the point of abstraction before. I saw the obvious benefit of being able to use print() and other handy functions, but I never really thought about it while writing my own scripts. Then I started working on a LiDAR processing pipeline, and I really struggled. These pipelines are long and complicated: some steps require GUIs, some are easy to automate, and many need to be done manually first to find good parameters before they can be automated.

This took me one semester to implement, and another semester to get comfortable with. It took a lot of meetings with advisors, mentors, and kind strangers, but eventually I got us a working pipeline. Then I needed to do the tough job of explaining how and why to use this pipeline.

In my experience, LiDAR is really tricky to get started with for people from all kinds of backgrounds. For people in the natural sciences, like me, it's much more math and computer science heavy than we are used to. For people in computer science, the hard part seems to be the geodetic elements. On top of all this, there are the logistical challenges of file formats, vendor-specific software, identifying noise, and so on. There is a lot to explain and it is easy to get confused.

In order to explain a pipeline like this, I really had to start using abstraction to describe it at different levels. This made me think a lot about the difference between the general overview and the implementation level.

This is an example of the high-level overview I would give.
[Figure: The simple version]

And for this example, I zoomed in on a specific step (getting the data from the scanner) and included some of the decisions that need to be made.
[Figure: The complicated version]

This requires a lot of decision making, and experience really makes a difference here. The process can be automated, but it requires a lot of parameters to be decided. These parameters vary depending on the type and platform of scanner, wavelength of the laser, the weather, the number of scanning positions... It takes a lot of work to get centimeter or millimeter level accuracy!

Obviously this is a lot to explain, and it includes many details that people don't necessarily need to know right away, especially if someone is just curious about what LiDAR can do and whether it would be applicable to their work. This pipeline can be described at multiple levels of abstraction, so it's up to me to pick the best one for my audience at that time. I often want to go into the details, but sometimes "I used PDAL to prepare these rasters" is all that's needed.
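
The same idea can be sketched in code. This is only a minimal illustration, not the actual pipeline: the `Step` class, the `describe` function, and the three step names are hypothetical placeholders loosely based on things mentioned in this post, and the list of decisions is just the set of parameters named above.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One pipeline step: a one-line summary plus the decisions hidden behind it."""
    summary: str
    decisions: list[str] = field(default_factory=list)


# Placeholder steps (assumed for illustration, not the real pipeline diagram).
PIPELINE = [
    Step(
        "Get the data from the scanner",
        [
            "scanner type and platform",
            "laser wavelength",
            "weather",
            "number of scanning positions",
        ],
    ),
    Step("Identify noise"),
    Step("Prepare rasters"),
]


def describe(pipeline, detailed=False):
    """Return the pipeline as text lines at the chosen level of abstraction."""
    lines = []
    for number, step in enumerate(pipeline, start=1):
        lines.append(f"{number}. {step.summary}")
        if detailed:
            # Only the detailed view exposes the decisions behind each step.
            lines.extend(f"   - decide: {d}" for d in step.decisions)
    return lines


if __name__ == "__main__":
    print("\n".join(describe(PIPELINE)))                  # the simple version
    print("\n".join(describe(PIPELINE, detailed=True)))   # the complicated version
```

The data describing the pipeline stays the same in both cases. Only the `detailed` flag changes, which mirrors picking a level of abstraction for a particular audience.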
