In this article I will show you how to perform language identification on tweets, stream the results to a webpage, and display a real-time visualization. We will use my open source project nlphose and C3.js to create this visualization in minutes, without writing any Python code!
To run this example you need the following software:
- Docker
- ngrok (optional, not required if your OS has a GUI)
- Internet browser
Run the command below at a shell or command prompt. It pulls the latest nlphose Docker image from Docker Hub and should then start 'bash' inside the container:
docker run --rm -it -p 3000:3000 code2k13/nlphose:latest
Copy and paste the command below into the container's shell prompt. It starts the nlphose pipeline:
twint -s "netflix" |\
./twint2json.py |\
./lang.py |\
jq -c '[.id,.lang]' |\
./ws.js
The above command collects tweets containing the term "netflix" from Twitter using the 'twint' command and performs language identification on them. It then streams the output using a socket.io server. For more details, please refer to the wiki of my project. You can also create these commands graphically, as shown below, using the NlpHose Pipeline Builder tool.
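To make the data flow concrete: the `jq -c '[.id,.lang]'` step emits one compact JSON array per tweet, holding the tweet id and the detected language code. The sketch below shows, in plain JavaScript, how a consumer might tally such lines per language. The function name `tallyLanguages` and the sample ids are made up for illustration, and the ids are assumed to be strings; this is not code from nlphose.

```javascript
// Hypothetical helper, not part of nlphose: counts tweets per language
// from lines shaped like the output of `jq -c '[.id,.lang]'`.
function tallyLanguages(lines) {
  const counts = new Map();
  for (const line of lines) {
    const text = line.trim();
    if (text === "") continue; // skip blank lines
    let record;
    try {
      record = JSON.parse(text);
    } catch (err) {
      continue; // ignore lines that are not valid JSON
    }
    if (!Array.isArray(record) || record.length < 2) continue;
    const lang = record[1];
    if (typeof lang !== "string") continue; // no language detected
    counts.set(lang, (counts.get(lang) || 0) + 1);
  }
  return counts;
}

// Made-up sample lines (the ids are placeholders, not real tweets).
const sampleLines = [
  '["1001","en"]',
  '["1002","es"]',
  '["1003","en"]',
];
console.log(tallyLanguages(sampleLines));
```

A `Map` is used rather than a plain object so that an unexpected language value cannot collide with built-in property names.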
If you are running this pipeline on a headless server (no browser), you can expose port 3000 of your host machine over the internet using ngrok:
./ngrok http 3000
Download this HTML file from my GitHub repo. Edit the file and update the following line with your ngrok URL:
var endpointUrl = "https://your_ngrok_url"
If you are not using ngrok and have a browser installed on the system running the Docker container, simply change the line to:
var endpointUrl = "http://localhost:3000"
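For orientation, here is a rough sketch of what a page like this might do with `endpointUrl`: open a socket.io connection, keep a running count per language, and push the counts into a C3 chart. It is not the code from the downloaded HTML file. The event name `"message"`, the `#chart` element id, and the helper names are assumptions made for illustration, so check `ws.js` and the HTML file for the actual values. It also assumes the socket.io client and C3 scripts are already loaded on the page.

```javascript
// Hypothetical sketch, not the code from the article's HTML file.
var endpointUrl = "http://localhost:3000";

// Running count of tweets per language code.
const langCounts = new Map();

// Adds one [id, lang] record to the counts and returns the data in the
// `columns` shape that C3 accepts: [["en", 2], ["es", 1], ...].
function addRecord(counts, record) {
  const lang = record[1];
  counts.set(lang, (counts.get(lang) || 0) + 1);
  return Array.from(counts.entries());
}

// Only wire things up when running in a page that has loaded both libraries.
if (typeof io !== "undefined" && typeof c3 !== "undefined") {
  const chart = c3.generate({
    bindto: "#chart", // assumed element id
    data: { columns: [], type: "bar" },
  });

  const socket = io(endpointUrl);
  // "message" is an assumed event name; ws.js may emit a different one.
  socket.on("message", (payload) => {
    // Assumes the payload is either a JSON string or an already-parsed array.
    const record = typeof payload === "string" ? JSON.parse(payload) : payload;
    chart.load({ columns: addRecord(langCounts, record) });
  });
}
```

Keeping the counting logic in a small pure function such as `addRecord` makes it possible to check it outside the browser, independently of the socket and the chart.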
You will need to serve this HTML file from a local web server (for example, http-server or python -m http.server 8080). Once you open it in your browser, you should see a webpage like the one shown below:
That's it! I hope you enjoyed this article!