Vladimir Klepov

I built Hippotable for in-browser data analysis

I'm happy to announce the first public release of Hippotable — a tool that lets you analyze data without leaving your browser, on desktop & mobile.

I often analyze small- to mid-sized datasets for work and for fun — e.g. to find out the distribution of a certain bug by platform, or calculate unique affected users. But what tools do I have to help me here?

  1. Bash lets you sort | uniq | wc -l, which is handy, but building advanced pipelines is hard.
  2. Google Sheets does the job, but struggles above 10K rows due to all the cruft, and using it for sensitive data such as personal budgets or user data is a no-no.
  3. Python + Jupyter + pandas is up to any data problem, but it's overkill for my simple use cases and requires a lot of code.

So I set out to build a simple browser-based tool to do the job. Hippotable can:

  • Open CSV files up to 100 MB in size.
  • Scroll through thousands of rows.
  • Filter and sort your data in real time.
  • Aggregate / groupby data to gain deeper insights.
  • 🏗️ Build powerful data pipelines with multiple filter / aggregate steps.
  • Share results with CSV export.

It's also free and open source.

Example

Now, let me walk you through an example of analyzing an annotated movie dataset from Kaggle. Let's start simple and see which countries, on average, make the best movies. Group by country, sort by average rating:

Image description
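
If you'd like to see the logic of this step spelled out, here is a rough sketch in plain Python. It is not how Hippotable works internally; the `country` and `rating` field names and the rows themselves are invented for illustration and do not come from the Kaggle dataset.

```python
from collections import defaultdict
from statistics import mean

# Toy rows invented for illustration; not taken from the Kaggle dataset.
movies = [
    {"country": "A", "rating": 8.0},
    {"country": "A", "rating": 6.0},
    {"country": "B", "rating": 9.0},
    {"country": "C", "rating": 5.0},
    {"country": "C", "rating": 6.0},
]


def average_rating_by_country(rows):
    """Group rows by country, then sort by mean rating, best first."""
    ratings = defaultdict(list)
    for row in rows:
        ratings[row["country"]].append(row["rating"])
    averages = [(country, mean(values)) for country, values in ratings.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)


ranking = average_rating_by_country(movies)
print(ranking)
```

Note that the hypothetical country "B" ends up on top with a single movie, so a plain average rewards tiny samples.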

Hm, this looks like a selection of countries that happened to co-produce a decent film once. Not that interesting. Let's try again, removing countries that have fewer than 10 movies:

Image description
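
The minimum-count filter can be sketched in plain Python as well. Again, this only illustrates the idea: the function name, the `min_movies` parameter, and the toy rows are assumptions made for the example, not Hippotable's API or the real data.

```python
from collections import defaultdict
from statistics import mean


def rank_countries(rows, min_movies=10):
    """Mean rating per country, keeping only countries with at least min_movies rows."""
    ratings = defaultdict(list)
    for row in rows:
        ratings[row["country"]].append(row["rating"])
    kept = [
        (country, mean(values), len(values))
        for country, values in ratings.items()
        if len(values) >= min_movies  # drop countries with too few movies
    ]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Toy rows invented for illustration: "A" has 12 movies, "B" has only one.
toy_rows = [{"country": "A", "rating": 6.0}] * 12 + [{"country": "B", "rating": 9.0}]
filtered = rank_countries(toy_rows)
print(filtered)
```

Here the one-hit country "B" drops out, and only "A" remains in the ranking.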

Now that's unexpected! In case you're curious, lots and lots of bad films come from Italy:

Image description

Combining multiple filter and aggregation layers enables really powerful processing pipelines. For example, here are the countries that were home to the most great directors (see, not all is lost for Italy):

Image description
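
A multi-step pipeline like this one could look roughly as follows in plain Python. The definition of a "great" director is my assumption for the sketch (at least `min_movies` films with an average rating of `min_avg` or more), as are the field names, the one-country-per-director simplification, and the toy rows.

```python
from collections import Counter, defaultdict
from statistics import mean


def great_directors_by_country(rows, min_avg=7.5, min_movies=2):
    """Aggregate per director, filter the great ones, then count them per country."""
    # Step 1: aggregate ratings per (director, country) pair.
    by_director = defaultdict(list)
    for row in rows:
        by_director[(row["director"], row["country"])].append(row["rating"])

    # Step 2: filter, keeping directors that pass both thresholds (assumed definition).
    great = [
        key
        for key, ratings in by_director.items()
        if len(ratings) >= min_movies and mean(ratings) >= min_avg
    ]

    # Step 3: aggregate again, counting great directors per country.
    counts = Counter(country for _director, country in great)

    # Step 4: sort by count, highest first.
    return counts.most_common()


# Toy rows invented for illustration; not taken from the Kaggle dataset.
toy_movies = [
    {"director": "Dir1", "country": "A", "rating": 8.0},
    {"director": "Dir1", "country": "A", "rating": 8.0},
    {"director": "Dir2", "country": "A", "rating": 9.0},
    {"director": "Dir2", "country": "A", "rating": 7.0},
    {"director": "Dir3", "country": "B", "rating": 8.5},
    {"director": "Dir3", "country": "B", "rating": 7.5},
    {"director": "Dir4", "country": "B", "rating": 5.0},
    {"director": "Dir4", "country": "B", "rating": 6.0},
    {"director": "Dir5", "country": "C", "rating": 9.5},
]
top_countries = great_directors_by_country(toy_movies)
print(top_countries)
```

Each step feeds the next one, which is the same filter / aggregate chaining idea, just written out by hand.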


That's it for today! Give Hippotable a try and star it on GitHub to help spread the word. Join me next time to learn about the amazing tech I used to make this happen.
