DEV Community

Michael Iodice

Ruby CLI Scraping Trail Selectors

The Journey

The first couple of weeks of Flatiron's Software Engineering course were tough at times and very rewarding at others. The Phase One project required us to find a website with reasonably structured data that would lend itself to scraping and offer a consistent second page from which to return "in depth" information. To structure the scraped data, we needed at least one "has many" class relationship. The user would not need to know what was going on behind the scenes; they would simply be presented with a clean CLI program in which to make selections.

For my project, I decided to use a website that featured the 100 top trails in America. I set up my "has many" class relationship between trails and states: one state has many trails, and each trail belongs to one state.
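A minimal sketch of that relationship might look like the following. The class names, method names, and sample trail names are all hypothetical placeholders, since the project's actual code isn't shown in this post:

```ruby
# Hypothetical State and Trail classes illustrating a "has many" relationship.
# None of these names come from the original project.
class State
  attr_reader :name

  @@all = []

  def initialize(name)
    @name = name
    @@all << self
  end

  def self.all
    @@all
  end

  # Reuse an existing state so every trail in it points at the same object.
  def self.find_or_create_by_name(name)
    all.find { |state| state.name == name } || new(name)
  end

  # A state "has many" trails: ask Trail for every trail that points at this state.
  def trails
    Trail.all.select { |trail| trail.state == self }
  end
end

class Trail
  attr_reader :name, :state

  @@all = []

  # Each trail belongs to exactly one state.
  def initialize(name, state)
    @name = name
    @state = state
    @@all << self
  end

  def self.all
    @@all
  end
end

# Placeholder data, not scraped from any site.
colorado = State.find_or_create_by_name("Colorado")
Trail.new("Example Trail A", colorado)
Trail.new("Example Trail B", colorado)
puts colorado.trails.map(&:name).inspect
```

Here the state never stores its own list of trails. It looks them up from `Trail.all` each time, so the trail stays the single source of truth for which state it belongs to.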

What I loved

Of the material so far, the project was perhaps the most fun for me. My background in coding comes from replicating products that were way too expensive (motion control equipment for camera time-lapses) or building products that I simply could not find, like my 'motivation box' project, which opens upon completion of a to-do list in an Android application. Being dropped into an empty code editor, I found the process very familiar: scratching out a flow diagram, identifying which tools would be needed for the correct functionality, and building each component one at a time. I enjoyed the stage of 'architecting' the program and trying to anticipate all the structural issues and concerns that I'd need to take into account when selecting methods.

Scraping proved to be the most time-consuming part of the build. Having very limited experience with HTML, I found myself 'getting lucky' while picking selectors to hunt down my data fields. From there, I really enjoyed building the functionality step by step and then carefully deciding which class should be responsible for each piece of code. Sometimes that was very obvious, other times not so much. I ended up using the main CLI file (as it guides the user through the program) to take user input and call the other classes' methods as much as possible. The classes, in turn, focus on handing data back and forth between themselves, only returning that data to the CLI when it needs to be presented to the user.

The next big focus was eliminating as much stored state as possible within each class and making sure the CLI called class methods that directly return each instance's own attributes. This matters because the CLI then retrieves the 'primary source' of the data rather than a copy previously stored in a hash, array, or variable. Constantly having to ask a class for specific pieces of data really reinforced my understanding of the classes and the relationships between them. I also had fun building a separate validation class to make sure the user could not crash the program by entering invalid input.
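As a rough illustration of the validation idea, here is a small sketch of a class that checks a numbered menu choice. The `InputValidator` name and the `menu_choice` method are assumptions made for this example, not the project's actual code:

```ruby
# Hypothetical validator; the class and method names are illustrative only.
class InputValidator
  # Returns the chosen 1-based number as an Integer, or nil when the
  # input is not a whole number between 1 and max.
  def self.menu_choice(input, max)
    text = input.to_s.strip
    return nil unless text.match?(/\A\d+\z/)

    choice = text.to_i
    choice.between?(1, max) ? choice : nil
  end
end

p InputValidator.menu_choice("3", 5)
p InputValidator.menu_choice("abc", 5)
p InputValidator.menu_choice("9", 5)
```

A CLI loop could keep prompting until `menu_choice` returns something other than `nil`, so bad input never reaches the classes that hold the data.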

If I were to do the project over again, I'd have built another class which would have the ability to filter by activity allowed at the trails. The website featured bicycling, hiking, horseback riding, and a couple of other options which varied for each trail. I thought it would have been really neat to be able to select by activity type, and perhaps enter in a number of different states, and show the user the results that matched their inputs. After finishing my core functionality, I ended up on Zoom helping others with their questions and problems. Seeing them start to connect the dots and making new friends was extraordinarily rewarding. Looking back, it's hard to say if I would have learned more by implementing the new class, or by explaining concepts and looking at other students' code. Either way, I consider the first project a success and am feeling excited to head into my project review and share my work.
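I never built the activity filter, but a sketch of the idea could look something like this. Everything here (the `TrailRecord` struct, the `TrailFilter` class, and the sample data) is hypothetical:

```ruby
# Hypothetical sketch of the activity filter described above.
# Names and sample data are placeholders, not part of the actual project.
TrailRecord = Struct.new(:name, :state, :activities)

class TrailFilter
  def initialize(trails)
    @trails = trails
  end

  # Returns the trails that allow the activity and, when any states are
  # given, are located in one of those states.
  def by_activity(activity, states: [])
    @trails.select do |trail|
      trail.activities.include?(activity) &&
        (states.empty? || states.include?(trail.state))
    end
  end
end

trails = [
  TrailRecord.new("Placeholder Trail A", "Utah", ["hiking", "bicycling"]),
  TrailRecord.new("Placeholder Trail B", "Maine", ["hiking"]),
  TrailRecord.new("Placeholder Trail C", "Utah", ["horseback riding"])
]

filter = TrailFilter.new(trails)
puts filter.by_activity("hiking").map(&:name).inspect
puts filter.by_activity("hiking", states: ["Maine"]).map(&:name).inspect
```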
