
AI days

Intro

The following is a report on an event that took place on Nov 29th this year at Volvo. We listened to many talks about the AI revolution, both globally and locally. We are already using AI. Where, and with which tools? Let me explain.

AI application, trends

There are some misconceptions about what AI is and how it differs from Machine Learning. Thanks to Jair Ribeiro we could learn that AI is currently a major trend, and we have to face it. At Volvo there are use cases such as the famous Vera vehicle, along with other projects that use data to draw conclusions and stay informed.

Chatbots

Virtual assistants: this topic has become more popular recently, and it is surely not a dream! Whenever you visit a website and a nice chat window pops up at the bottom right of your browser, it is probably a machine talking to you. At Volvo there is a use case for a chatbot in the ticketing system. Singh Kumar showed us a chatbot that works with the internal knowledge database, helping you choose the correct person or team for your problem. It can also answer you directly if it knows the answer. There are many challenges you can face when installing a system like this one. For example, you have to teach your chatbot when it should give up and say "I don't know" - a sketch of that idea follows below.
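
Here is a minimal, self-contained sketch of that "give up" idea, assuming a simple intent classifier with a confidence threshold. The toy questions, team names, and the 0.6 threshold are all invented; the actual chatbot surely works differently:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy ticket history (made up): question -> team that resolved it
questions = [
    "my laptop will not boot",
    "reset my password please",
    "vpn keeps dropping",
    "I forgot my login",
]
teams = ["hardware", "accounts", "network", "accounts"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(questions, teams)

def answer(question: str, threshold: float = 0.6) -> str:
    """Route the ticket only if the classifier is confident enough."""
    probs = clf.predict_proba([question])[0]
    if probs.max() < threshold:
        return "I don't know - let me forward you to a human."
    return f"Forwarding you to the {clf.classes_[probs.argmax()]} team."

print(answer("cannot remember my password"))
print(answer("the coffee machine is on fire"))
```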

AI Mindful?

Thanks to Patrick Kozakiewicz we could have a moment of reflection on what kind of people we are and how that affects the software we produce. Of course, AI is another type of software, and it was interesting to learn that it can be biased by the way it was produced. There were some tricky questions about "bad AI", and of course that can happen if we feed AI bad content. Should we teach AI our normal, human behaviors? I believe we should. And we should all be aware that when we are under stress our productivity suffers, so take care of your surroundings.

Data collection

A very important, and the most time-consuming, part of a machine learning process is the preparation phase. As Bartek Starościak explained, there is a project to improve a production process: we want to predict which features have an impact on future vehicle failures. To collect all the data, you have to face many kinds of inputs. Apart from nicely formatted files, there are hand-written notebooks and a variety of different systems and databases.
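
As a hedged illustration of what that preparation can look like, here is a small pandas sketch that joins a CSV export with a SQL source. The file, table, and column names are all invented:

```python
import sqlite3

import pandas as pd

# A nicely formatted input: a CSV export of sensor measurements
measurements = pd.read_csv("sensor_readings.csv")

# One of the many other systems: a SQL database of past repairs
with sqlite3.connect("maintenance.db") as conn:
    repairs = pd.read_sql("SELECT vehicle_id, failure_date FROM repairs", conn)

# Combine both sources into a single training table keyed on the vehicle
training = measurements.merge(repairs, on="vehicle_id", how="left")
training["failed"] = training["failure_date"].notna()  # the label to predict
```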

Qlik

With lots of data, sooner or later you will face the issue of choosing the best visualization to explain your data to a business person. As Tiago Hubner and his colleagues told us, we can leverage Qlik Sense for that purpose. The platform has lots of features, but one really caught my attention: Insights. It is a nice feature that suggests what you can find in your data and how you can visualize it.

Black box

There are some machine learning algorithms that are pretty hard to explain. When is a model explanation important? As Bartosz Kurlej explained, there are business areas where a detailed explanation is needed. For example, if you want to prove to bank management that someone should not be given a loan, it is better to explain the reasons for that decision. To make it happen, we can use the LIME algorithm.
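
Here is a hedged sketch of that approach using the Python lime package with a scikit-learn classifier. The loan-style feature names and labels are synthetic, invented just for illustration:

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                  # synthetic applicant features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic "approved" label

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=["income", "credit_history", "debt_ratio"],
    class_names=["rejected", "approved"],
    mode="classification",
)

# Explain a single decision: which features pushed it which way
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
for feature, weight in explanation.as_list():
    print(feature, weight)
```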

Tools

Azure and Databricks

I cannot imagine AI without tools. At some point you will face the following challenge: your local machine is not powerful enough. Training models requires a lot of computational power, which you most likely do not have. What should you do then? Use clusters and make your job distributed. It is far easier to find lots of regular computers than one big machine with one big processor.

Damian Kowalczyk explained what Microsoft's response to AI is and how we can use Azure for our AI work. Azure really has a wide variety of tools; among them you can find:

  • trained models ready to be used, like computer vision or speech models
  • frameworks, e.g. TensorFlow and Keras
  • services ready to use, e.g. Databricks (my favorite)
  • infrastructure: you don't care about how it is running
  • deployment: your model can be easily packaged in a container and exposed as a web service (see the sketch after this list)
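
To illustrate that last point, here is a minimal sketch of packaging a trained model behind a web service with Flask. This is a generic pattern rather than the Azure-specific tooling, and the file and route names are made up:

```python
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load a previously trained scikit-learn model from disk (hypothetical file)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]  # e.g. [[1.2, 3.4, 5.6]]
    prediction = model.predict(features).tolist()
    return jsonify(prediction=prediction)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```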

I am particularly interested in that last element. When developing a model, sooner or later you will want to share the great results of your work with others. We could also learn about Spark on Azure, i.e. the Databricks platform. Running your data processing in a distributed environment is not that old an idea: everything started to happen around 15 years ago, when Google published its paper on MapReduce (2004). Then things changed dramatically when Spark was introduced, as it was different from its precursor, Hadoop. Spark can do its job fully in memory, which makes its computation more powerful. And this is why Databricks offers easy access to this technology using a data-science standard: Jupyter notebooks.
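
Here is a minimal PySpark sketch of that in-memory, distributed style of computation, the classic word count. It assumes a local Spark installation; on Databricks, a spark session is already available in the notebook:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ai-days-demo").getOrCreate()

# The classic MapReduce example, held in memory across the cluster
lines = spark.sparkContext.parallelize(["spark is fast", "spark is in memory"])
counts = (
    lines.flatMap(lambda line: line.split())  # map: line -> words
         .map(lambda word: (word, 1))         # map: word -> (word, 1)
         .reduceByKey(lambda a, b: a + b)     # reduce: sum counts per word
)
print(counts.collect())
spark.stop()
```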

NLP

Maciej Szymczak provided an introduction to NLP for developers. I have been particularly interested in NLP lately, so I listened intently, and I can assure you he did a great job. I liked the way he explained NLP for developers: it was pitched at a level where anyone who is starting to play with text could feel encouraged. We saw a live demo of sentiment analysis using the standard bag-of-words technique. But that is not all: we also learned about the current trends in NLP, with names like spaCy and gensim to be used in modern NLP projects. Obviously, bag-of-words is not perfect and there are alternatives, so word embeddings were mentioned, including the continuous bag-of-words model. With them, we avoid losing the context of the words we analyze.
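
To give a flavour of the technique, here is a minimal bag-of-words sentiment sketch with scikit-learn. This is not the demo from the talk, and the tiny labelled dataset is made up:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled data (invented): 1 = positive, 0 = negative
texts = [
    "great talk, I loved the demo",
    "boring and way too long",
    "really useful and well explained",
    "a waste of time",
]
labels = [1, 0, 1, 0]

# Bag-of-words: each text becomes a vector of word counts
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["what a great and useful talk"]))
```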

Advanced analytics

Is advanced analytics something your business needs? This is the question Fabio Bezerra answered. We live in a world of big data, and sooner or later we have to face it. Other companies are already making good use of it, so we have to keep up. Big data analytics takes several forms. There is descriptive analytics, when we want to make our data more explainable, and there is prescriptive analytics, when we analyze data so it can support a decision later on. Which Business Intelligence tools make that analysis possible? We could learn about some of them, including Excel, Qlik Sense and KNIME.
