# Analyzing sugar in McDonald's menu items

I recently found a dataset on Kaggle that contains the nutrition facts for every item on the McDonald's menu (Dataset).

I've known for a while that our sugar consumption is well above what health organizations recommend, so I wanted to analyze how much sugar there is in the items on the McDonald's menu and which of them contain no sugar at all.

For this I used a Jupyter Notebook with the following libraries: plotly and pandas. Here I'm going to explain the steps I followed, but the complete notebook is available in my GitHub repository.

The data comes as a CSV file. First, let's load it to see how it is structured:

```python
import pandas as pd

menu = pd.read_csv('./menu.csv')
```

With this we can inspect the structure of the dataset, that is, its columns and rows.
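A few pandas attributes are usually enough for that first look. The following is only a minimal sketch: instead of the real file it uses a tiny stand-in DataFrame with three rows copied from the zero-sugar listing later in this post and just two columns (the real dataset has more), and `n_rows`, `n_cols` and `column_names` are names I made up for the example.

```python
import pandas as pd

# Stand-in for the real file: three rows copied from the zero-sugar
# listing in this post, with only two of the dataset's columns.
menu = pd.DataFrame({
    "Item": ["Coffee (Small)", "Hash Brown", "Dasani Water Bottle"],
    "Sugars": [0, 0, 0],
})

n_rows, n_cols = menu.shape        # (number of rows, number of columns)
column_names = list(menu.columns)  # column labels as a plain list

print(f"{n_rows} rows x {n_cols} columns")
print(column_names)
print(menu.head())                 # first rows (up to five by default)
```

On the real data, the same three calls would run on the `menu` loaded with `pd.read_csv`.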

OK, the column I'm interested in is sugar, so I'm going to create a new pandas data frame with just the item's name and its amount of sugar, and I'm also going to sort the rows in ascending order:

```python
# Keep only the item name and its sugar content
df_sugars = menu[['Item', 'Sugars']].copy()
print("Let's sort them by the amount of sugar they have, in ascending order: ")
df_sugars = df_sugars.sort_values('Sugars', ascending=True)
```

Now that I have this, I want to check which menu items don't contain any sugar:

```python
print("Number of items in the menu: " + str(len(menu.index)))
no_sugar = df_sugars.loc[df_sugars['Sugars'] == 0]
print("Number of items without sugar in the menu: " + str(len(no_sugar)))
print(no_sugar)
```

And I obtain the following result:

```
Number of items in the menu: 260
Number of items without sugar in the menu: 25
                             Item  Sugars
145                Coffee (Small)       0
99              Kids French Fries       0
96             Small French Fries       0
81   Chicken McNuggets (20 piece)       0
114             Diet Coke (Small)       0
115            Diet Coke (Medium)       0
116             Diet Coke (Large)       0
117             Diet Coke (Child)       0
122        Diet Dr Pepper (Small)       0
123       Diet Dr Pepper (Medium)       0
124        Diet Dr Pepper (Large)       0
98             Large French Fries       0
80   Chicken McNuggets (10 piece)       0
79    Chicken McNuggets (6 piece)       0
136           Dasani Water Bottle       0
137              Iced Tea (Small)       0
138             Iced Tea (Medium)       0
139              Iced Tea (Large)       0
140              Iced Tea (Child)       0
78    Chicken McNuggets (4 piece)       0
146               Coffee (Medium)       0
38                     Hash Brown       0
147                Coffee (Large)       0
125        Diet Dr Pepper (Child)       0
97            Medium French Fries       0
```
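
The share of sugar-free items follows from those two counts. Here is a quick sketch of the arithmetic, with the totals copied from the output above (the variable names are mine, not from the notebook):

```python
total_items = 260      # "Number of items in the menu" from the output
sugar_free_items = 25  # "Number of items without sugar in the menu"

# Percentage of menu items that contain no sugar
share = sugar_free_items / total_items * 100
print(f"{share:.1f}% of the items contain no sugar")
# prints: 9.6% of the items contain no sugar
```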

So only 25 of the 260 items, about 9.6% of the McDonald's menu, contain no sugar at all. Now let's plot this to see it graphically. For this I'm going to use the Plotly library:

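```python
# Assumed setup: the post does not show its imports. Plotly's offline
# mode is one option that provides iplot inside a notebook.
import plotly.graph_objs as go
import plotly.offline as py

py.init_notebook_mode(connected=True)

print("Let's start with the bar chart")

data = [go.Bar(
    y=df_sugars['Sugars'].values,
    x=df_sugars['Item'].values,
)]

py.iplot(data, filename='basic-bar')
```

I'm also going to draw a scatter plot: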

```python
# Now let's plot a scatter plot
# This plot is based on the one made by Anisotropic:
# https://www.kaggle.com/arthurtok/super-sized-we-mcdonald-s-nutritional-metrics

trace = go.Scatter(
    y=df_sugars['Sugars'].values,
    x=df_sugars['Item'].values,
    mode='markers',
    marker=dict(
        # Marker size and color both encode the amount of sugar
        size=df_sugars['Sugars'].values,
        color=df_sugars['Sugars'].values,
        colorscale='Portland',
        showscale=True
    ),
)
data = [trace]

layout = go.Layout(
    autosize=True,
    title='Scatter plot of Sugars per Item on the Menu',
    hovermode='closest',
    xaxis=dict(
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    yaxis=dict(
        title='Sugars(g)',
        ticklen=5,
        gridwidth=2,
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    showlegend=False
)
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='scatterChol')
```

The WHO (OMS in Spanish) says that the maximum amount of sugar per day should be about 50 g. Let's see which items on the menu go over this threshold:

```python
# First let's add a new column to the dataframe, all equal to 50
df_sugars['Amount of Sugar recommended (g)'] = 50

# Let's plot them
trace1 = go.Bar(
    y=df_sugars['Sugars'].values,
    x=df_sugars['Item'].values,
    name='Sugars(g)'
)
trace2 = go.Bar(
    y=df_sugars['Amount of Sugar recommended (g)'].values,
    x=df_sugars['Item'].values,
    name='Recommended value of sugar WHO (g)'
)

data = [trace1, trace2]

# Single layout; barmode='group' places the two traces side by side
layout = go.Layout(
    barmode='group',
    autosize=True,
    title="Relation between the WHO's recommendation and Sugars per Item on the Menu",
    hovermode='closest',
    xaxis=dict(
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    yaxis=dict(
        title='Sugars(g)',
        ticklen=5,
        gridwidth=2,
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    showlegend=False
)
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='grouped-bar')
```

So, as you can see, a lot of the items on the menu, if not all of them, are bad for our health. To look at the items in more detail, you can check the notebook, because the plots there are interactive.
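
The same comparison can also be made without a chart, by filtering the data frame on the 50 g threshold. This is only a minimal sketch: the stand-in `df_sugars` below uses made-up item names and sugar values (they are not taken from the dataset), and `DAILY_LIMIT_G` and `over_limit` are names I chose for the example.

```python
import pandas as pd

# Hypothetical stand-in data: item names and sugar values are made up
# for illustration and are not taken from the dataset.
df_sugars = pd.DataFrame({
    "Item": ["Item A", "Item B", "Item C", "Item D"],
    "Sugars": [0, 48, 64, 120],
})

DAILY_LIMIT_G = 50  # daily threshold used in this post

# Boolean mask: keep only rows whose sugar content is above the limit
over_limit = df_sugars[df_sugars["Sugars"] > DAILY_LIMIT_G]

print(f"{len(over_limit)} of {len(df_sugars)} items exceed {DAILY_LIMIT_G} g")
print(over_limit.sort_values("Sugars", ascending=False))
```

With the real `df_sugars` from the notebook, the same filter would list every item that exceeds the daily limit on its own.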

I hope you liked this short analysis. Check out the repository on GitHub! :)