Application Programming Interfaces¶
Task 1¶
Make a request to the Mensa API for the time frame from today + 1 day to today + 7 days and extract the current meal data. Save the data in the variable meals.
Get today:
from datetime import date, timedelta

today = date.today()
today
datetime.date(2023, 12, 22)
Define URL:
API_URL = "https://sls.api.stw-on.de/v1/location/101/menu/{}/{}".format(
today + timedelta(days = 1), today + timedelta(days = 7))
API_URL
'https://sls.api.stw-on.de/v1/location/101/menu/2023-12-23/2023-12-29'
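The URL above works because formatting a date object yields its ISO form YYYY-MM-DD. A minimal sketch of the same construction as a reusable function follows; build_menu_url and its parameters are hypothetical names introduced only for illustration, and the location ID 101 is simply the one that appears in the URL above.

```python
from datetime import date, timedelta

# Pattern copied from the URL used above; 101 is the location ID in that URL.
MENU_URL = "https://sls.api.stw-on.de/v1/location/{}/menu/{}/{}"


def build_menu_url(today, first_offset=1, last_offset=7, location_id=101):
    """Hypothetical helper: menu URL for today+first_offset .. today+last_offset."""
    start = today + timedelta(days=first_offset)
    end = today + timedelta(days=last_offset)
    # date.isoformat() gives 'YYYY-MM-DD', the same text str(date) produces
    return MENU_URL.format(location_id, start.isoformat(), end.isoformat())


# Reproduces the URL shown above for the run date 2023-12-22
url = build_menu_url(date(2023, 12, 22))
print(url)
```

Passing the date in as an argument, instead of calling date.today() inside the function, keeps the result reproducible on any day.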
Make a request and convert the response to JSON:
import requests

resp = requests.get(API_URL)
resp_json = resp.json()
Extract Meals Data:
meals = resp_json.get("meals")
meals[0]['price']
{'student': '2.05', 'employee': '3.75', 'guest': '4.85'}
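The request above assumes that the server answers successfully and that the body contains a meals key. A slightly more defensive sketch is shown below; fetch_json and extract_meals are hypothetical helper names, and the timeout of 10 seconds is an arbitrary choice. The sample payload reuses the price values printed above so that the parsing step can be tried without network access.

```python
import requests


def fetch_json(url, timeout=10):
    """Hypothetical helper: GET a URL and return the parsed JSON body."""
    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    return resp.json()


def extract_meals(payload):
    """Return the list under 'meals', or an empty list if it is missing or null."""
    return payload.get("meals") or []


# Offline check with a one-meal payload shaped like the response shown above
sample = {"meals": [{"price": {"student": "2.05", "employee": "3.75", "guest": "4.85"}}]}
sample_meals = extract_meals(sample)
print(sample_meals[0]["price"]["student"])
```

Note that the prices arrive as strings, which is why the conversion to a numeric type in the next step is needed.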
Now convert the list of meals into a pandas DataFrame object and store it in df_meals. Make sure that the date column has the datetime data type and that the prices are in a numeric (float) format.
Hint: You might want to check the pandas.json_normalize function.
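import pandas as pd

df_meals = pd.json_normalize(meals)
df_meals["date"] = pd.to_datetime(df_meals["date"])
# The task asks for all prices to be numeric, not only the student price
price_cols = ["price.student", "price.employee", "price.guest"]
df_meals[price_cols] = df_meals[price_cols].apply(pd.to_numeric)
df_meals.head(2)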
 | id | date | name | name_en | time | special_tags | price.student | price.employee | price.guest | nutritional_values._NOTE | ... | location.address.zip | location.address.city | location.opening_hours | lane.id | lane.name | lane.name_en | tags.categories | tags.allergens | tags.additives | tags.special |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 50181 | 2023-12-18 | Paprikasuppe | Bell pepper cream soup | noon | [Deprecated. Use tags→special instead.] | 2.05 | 3.75 | 4.85 | WARNING: These fields currently contain incorr... | ... | 38106 | Braunschweig | [{'time': 'noon', 'start_day': 1, 'end_day': 4... | 10 | Suppe & Co. | Soup & Co. | [{'id': 'VEGA', 'name': 'Vegan', 'name_en': 'v... | [{'id': 'SO', 'name': 'enthält Soja(bohnen)u d... | [{'id': '2', 'name': 'mit Konservierungsstoff'... | [] |
1 | 50183 | 2023-12-18 | Bulgurpfanne mit Wildkräutern \| Kräuterjoghurt... | Bulgur with wild herbs and courgettes \| Herb y... | noon | [Deprecated. Use tags→special instead.] | 2.50 | 5.80 | 6.90 | WARNING: These fields currently contain incorr... | ... | 38106 | Braunschweig | [{'time': 'noon', 'start_day': 1, 'end_day': 4... | 20 | Classic 1 | Classic 1 | [{'id': 'VEGT', 'name': 'Vegetarisch', 'name_e... | [{'id': 'ML', 'name': 'enthält Milch u Milcher... | [] | [] |
2 rows × 33 columns
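To see what pandas.json_normalize does with the nested price field, the following sketch runs it on a single record. The record is reduced to four fields whose values are taken from the output above; a real API record contains many more keys.

```python
import pandas as pd

# One record reduced to a few fields; values are taken from the output above
meals_sample = [
    {
        "id": 50181,
        "date": "2023-12-18",
        "name": "Paprikasuppe",
        "price": {"student": "2.05", "employee": "3.75", "guest": "4.85"},
    }
]

df_sample = pd.json_normalize(meals_sample)
# Nested keys become dotted column names such as 'price.student'
print(list(df_sample.columns))

# Convert the string columns to the types the task asks for
df_sample["date"] = pd.to_datetime(df_sample["date"])
price_cols = ["price.student", "price.employee", "price.guest"]
df_sample[price_cols] = df_sample[price_cols].apply(pd.to_numeric)
print(df_sample.dtypes)
```

Applying pd.to_numeric column by column converts all three price columns in one statement.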
Next, create a simple plot that plots the student prices over time. You can use a scatter plot to do this.
import matplotlib.pyplot as plt

plt.scatter(df_meals["date"], df_meals["price.student"])
plt.show()
Task 2¶
Next, make a request to the Mensa API to get all meals in the time frame December 1, 2023 to December 21, 2023.
After requesting the data, analyze the distribution of the mean prices and one further aspect that you find interesting (be creative).
What can you observe?
API_URL = "https://sls.api.stw-on.de/v1/location/101/menu/2023-12-01/2023-12-21"
resp = requests.get(API_URL)
data = resp.json()
meals = data.get("meals")
len(meals)
40
df_meals = pd.json_normalize(meals, max_level=1)
df_meals = df_meals.astype({"price.student": "float", "price.employee": "float", "price.guest": "float"})
df_meals["date"] = df_meals["date"].astype("datetime64[ns]")
df_meals.head(1)
 | id | date | name | name_en | time | special_tags | price.student | price.employee | price.guest | nutritional_values._NOTE | ... | location.name | location.address | location.opening_hours | lane.id | lane.name | lane.name_en | tags.categories | tags.allergens | tags.additives | tags.special |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 50181 | 2023-12-18 | Paprikasuppe | Bell pepper cream soup | noon | [Deprecated. Use tags→special instead.] | 2.05 | 3.75 | 4.85 | WARNING: These fields currently contain incorr... | ... | Mensa 1 TU Braunschweig | {'line1': 'Mensa 1 TU Braunschweig', 'line2': ... | [{'time': 'noon', 'start_day': 1, 'end_day': 4... | 10 | Suppe & Co. | Soup & Co. | [{'id': 'VEGA', 'name': 'Vegan', 'name_en': 'v... | [{'id': 'SO', 'name': 'enthält Soja(bohnen)u d... | [{'id': '2', 'name': 'mit Konservierungsstoff'... | [] |
1 rows × 22 columns
Analyzing Price Distribution:
aggregation_functions = {
    "date": "first",
    "id": "first",
    "price.student": "mean",
    "price.employee": "mean",
    "price.guest": "mean",
    "location.id": "first",
    "lane.id": "first",
}
df_new = df_meals.groupby(df_meals["date"]).aggregate(aggregation_functions)
df_new
date (index) | date | id | price.student | price.employee | price.guest | location.id | lane.id |
---|---|---|---|---|---|---|---|
2023-12-18 | 2023-12-18 | 50181 | 2.020 | 3.185 | 3.700 | 101 | 10 |
2023-12-19 | 2023-12-19 | 50189 | 1.975 | 3.235 | 3.715 | 101 | 10 |
2023-12-20 | 2023-12-20 | 50397 | 2.020 | 3.245 | 3.760 | 101 | 10 |
2023-12-21 | 2023-12-21 | 50225 | 2.135 | 3.300 | 3.815 | 101 | 10 |
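If only the daily mean prices are of interest, the aggregation dictionary can be replaced by selecting the price columns before averaging. The sketch below uses just the two rows for 2023-12-18 that are visible in the earlier head() output, so its means differ from the table above, which is based on all meals of that day.

```python
import pandas as pd

# Two rows for 2023-12-18 taken from the head() output above (not the full data)
df_excerpt = pd.DataFrame({
    "date": pd.to_datetime(["2023-12-18", "2023-12-18"]),
    "price.student": [2.05, 2.50],
    "price.employee": [3.75, 5.80],
    "price.guest": [4.85, 6.90],
})

price_cols = ["price.student", "price.employee", "price.guest"]
# Selecting the price columns first means only they are averaged
daily_mean = df_excerpt.groupby("date")[price_cols].mean()
print(daily_mean)
```

The result is indexed by date and has no duplicate date column, which makes it convenient to plot directly with daily_mean.plot().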
# Plot the daily mean price for each customer group
plt.plot(df_new["date"], df_new["price.student"], color="yellow", label="student")
plt.plot(df_new["date"], df_new["price.employee"], color="red", label="employee")
plt.plot(df_new["date"], df_new["price.guest"], color="purple", label="guest")
plt.title("Mensa prices over time")
plt.xlabel("Date")
plt.ylabel("Price")
plt.legend()  # without this call the labels above are never displayed
plt.show()
Data Analysis¶
Setup¶
import pandas as pd
import matplotlib.pyplot as plt
Next, read in the dataset survey.csv. For testing purposes, use the variable name df.
df = pd.read_csv("survey.csv")
df.head()
Age | Sex | Scale Python Exp | Course | Has Voice Assistent Contact | Voice Assistent | Scale Study Satisfaction | Uses Smartphone | Which Smartphone | Has Computer | Which OS | Scale Programming Exp | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 22 | Männlich | 4 | Medienwissenschaften | Ja | Apple Siri | 4 | Ja | Apple | Ja | Mac OS | 2 |
1 | 26 | Weiblich | 3 | Medienwissenschaften | Ja | Amazon Alexa | 2 | Ja | Xiaomi | Ja | Windows 10 | 3 |
2 | 21 | Männlich | 3 | Medienwissenschaften | Ja | Google Now | 4 | Ja | Sonstige | Ja | Windows 10 | 3 |
3 | 26 | Weiblich | 4 | Medienwissenschaften | Ja | Apple Siri | 4 | Ja | Samsung | Ja | Windows 10 | 2 |
4 | 24 | Weiblich | 4 | Psychologie | Nein | NaN | 4 | Ja | Apple | Ja | Windows 11 | 3 |
Hypothesis 1¶
The average age in the course is 25.34 years, with a surplus of females.
print("Mean Age:", df["Age"].mean())
c = df["Sex"].value_counts()
plt.title("Gender Distribution")
plt.bar(c.keys(), c, color=["black", "white"], edgecolor="black")  # outline keeps the white bar visible
plt.show()
Mean Age: 22.64
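Both parts of the hypothesis can also be checked programmatically rather than by reading the plot. The sketch below uses only the five rows shown by df.head() above, so its numbers differ from the full dataset; the tolerance of 0.01 years is an arbitrary choice for illustration.

```python
import pandas as pd

# First five rows of the survey as shown by df.head() above (not the full dataset)
head = pd.DataFrame({
    "Age": [22, 26, 21, 26, 24],
    "Sex": ["Männlich", "Weiblich", "Männlich", "Weiblich", "Weiblich"],
})

hypothesized_mean = 25.34  # value stated in Hypothesis 1
observed_mean = head["Age"].mean()
counts = head["Sex"].value_counts()

# Tolerance of 0.01 years is an arbitrary choice for this illustration
mean_matches = abs(observed_mean - hypothesized_mean) < 0.01
more_females = counts.get("Weiblich", 0) > counts.get("Männlich", 0)
print(observed_mean, mean_matches, more_females)
```

Run against the full df, the same comparison would use the mean of 22.64 printed above, which does not match the hypothesized 25.34.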
Hypothesis 2¶
The most used voice assistant is Alexa.
c = df["Voice Assistent"].value_counts()
plt.title("Voice Assistant Distribution")
plt.bar(c.keys(), c, color=["red", "green", "blue"])
plt.show()
Hypothesis 3¶
Hypothesis 3.1¶
The least used smartphone operating system is iOS.
os_dict = {
"iOS": 0,
"Android": 0
}
# Every brand other than Apple (including missing values) is counted as Android
for data in df["Which Smartphone"]:
    if data == "Apple":
        os_dict["iOS"] += 1
    else:
        os_dict["Android"] += 1
plt.title("Mobile OS Distribution")
plt.bar(os_dict.keys(), os_dict.values(), color=["silver", "green"])
plt.show()
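The counting loop above can also be written without explicit iteration. The following sketch applies the same rule (Apple means iOS, everything else means Android) to the five smartphone values visible in the df.head() output; it is an alternative formulation, not the approach used in the notebook.

```python
import pandas as pd

# "Which Smartphone" values from the df.head() output above (not the full dataset)
phones = pd.Series(
    ["Apple", "Xiaomi", "Sonstige", "Samsung", "Apple"],
    name="Which Smartphone",
)

# Same rule as the loop: Apple -> iOS, everything else (including "Sonstige") -> Android
mobile_os = (phones == "Apple").map({True: "iOS", False: "Android"})
os_counts = mobile_os.value_counts()
print(os_counts)
```

As in the loop, respondents with a missing or unlisted brand end up in the Android group, which may overstate the Android count.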
Hypothesis 3.2¶
The most used desktop operating system is Windows 10.
os_dict = df["Which OS"].value_counts()
plt.title("Desktop OS Distribution")
plt.bar(os_dict.keys(), os_dict, color=["grey", "silver", "cyan", "yellow"])
plt.show()
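Instead of reading the most frequent category off the bar chart, it can be taken directly from the value counts. The sketch below uses the five "Which OS" values from the df.head() output above, so the share it prints applies to that excerpt only.

```python
import pandas as pd

# "Which OS" values from the df.head() output above (not the full dataset)
desktop_os = pd.Series(
    ["Mac OS", "Windows 10", "Windows 10", "Windows 10", "Windows 11"],
    name="Which OS",
)

counts = desktop_os.value_counts()
most_used = counts.idxmax()                # label with the highest count
share = counts.max() / counts.sum() * 100  # its share in percent
print(most_used, share)
```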
Hypothesis 4¶
Hypothesis 4.1 & 4.2¶
The youngest people use an iPhone. Older people use an Android-based smartphone.
age_mean = df["Age"].mean()
youngest = {
"ios": 0,
"Android": 0,
"n": 0,
"ratio iOS": 0,
"ratio Android": 0
}
oldest = {
"ios": 0,
"Android": 0,
"n": 0,
"ratio iOS": 0,
"ratio Android": 0
}
Count appearances:
for index, row in df[["Age", "Which Smartphone"]].iterrows():
    # Respondents younger than the mean age
    if age_mean > row["Age"]:
        youngest["n"] += 1
        if row["Which Smartphone"] == "Apple":
            youngest["ios"] += 1
        else:
            youngest["Android"] += 1
    # Respondents older than the mean age
    if age_mean < row["Age"]:
        oldest["n"] += 1
        if row["Which Smartphone"] != "Apple":
            oldest["Android"] += 1
        else:
            oldest["ios"] += 1
Calculate ratios:
youngest["ratio iOS"] = youngest["ios"] / youngest["n"] * 100
youngest["ratio Android"] = youngest["Android"] / youngest["n"] * 100
oldest["ratio iOS"] = oldest["ios"] / oldest["n"] * 100
oldest["ratio Android"] = oldest["Android"] / oldest["n"] * 100
plt.title("Smartphone OS share below and above the mean age")
plt.bar(["Below mean: iOS", "Below mean: Android", "Above mean: iOS", "Above mean: Android"],
        [youngest["ratio iOS"], youngest["ratio Android"], oldest["ratio iOS"], oldest["ratio Android"]], color=["gold", "silver"])
plt.ylabel("Share (%)")
plt.show()
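The two counting dictionaries and the ratio calculation can be condensed into a single cross-tabulation. The sketch below is one possible alternative, shown on the five rows from the df.head() output; the column names group and os are introduced here for illustration. Like the loop above, it leaves out respondents whose age equals the mean exactly.

```python
import pandas as pd

# "Age" and "Which Smartphone" from the df.head() output above (not the full dataset)
df_head = pd.DataFrame({
    "Age": [22, 26, 21, 26, 24],
    "Which Smartphone": ["Apple", "Xiaomi", "Sonstige", "Samsung", "Apple"],
})

age_mean = df_head["Age"].mean()
# Drop rows exactly at the mean, matching the strict < and > comparisons in the loop
subset = df_head[df_head["Age"] != age_mean].copy()
subset["group"] = (subset["Age"] > age_mean).map({False: "younger", True: "older"})
subset["os"] = (subset["Which Smartphone"] == "Apple").map({True: "iOS", False: "Android"})

# normalize="index" turns each row of counts into shares that sum to 1
ratios = pd.crosstab(subset["group"], subset["os"], normalize="index") * 100
print(ratios)
```

Each row of ratios sums to 100, so the younger and older groups can be compared directly even when they differ in size.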