\title{Summary of Citibike miniproject}
\author[1]{Yaxuan Yin}%
\affil[1]{New York University Shanghai}%
\textbf{Abstract -}
Citi Bike is currently the biggest bike sharing system in New York. As we
know, many external factors, such as the location of MTA stations,
determine the number of customers, trip duration, and so on. In addition,
internal factors such as gender and age also play an important role in
Citi Bike analysis.
I use the birth year 1990 as the dividing line and focus on riders born
before and after 1990. The aim is to find out whether there is any
relationship between rider age and Citi Bike usage.
\textbf{Introduction -}
Like other sharing systems, such as Airbnb for housing and Uber for cars,
Citi Bike is a network of bicycle rental stations intended for
point-to-point transportation.
Citi Bike is New York City's largest bike sharing system. It's a
convenient solution for trips that are too far to walk but too short for
a taxi or the subway. The bike sharing system is combined with all other
transportation methods available in the area for commuters.
Currently, Citi Bike riders take about a million trips per month on
average. The system has 10,000 bicycles and 610 stations. By the end of
2017, the Citi Bike system will have grown to 12,000 bikes and 750
stations.
\textbf{Data -}
The data come from the Citi Bike open data
(\url{https://www.citibikenyc.com/system-data}), starting from 2015.
I divided the data into two data frames, split at the birth year 1990.
As Figure 1 and Figure 2 show, the bars are concentrated in a narrow
range, which indicates that the birth years of users are relatively
concentrated.
Figure 3 is a simple scatter plot of birth year against trip duration.\selectlanguage{english}
\textbf{Methodology -}
NULL HYPOTHESIS:~
The proportion of weekend trips taken by riders born after 1990 is the
same or less than the proportion of weekend trips taken by riders born
in or before 1990.
$H_0$:~
$p_{\mathrm{younger}} - p_{\mathrm{older}} \le 0$
ALTERNATIVE HYPOTHESIS:~
The proportion of weekend trips taken by riders born after 1990 is
greater than the proportion of weekend trips taken by riders born in or
before 1990.
$H_A$:~
$p_{\mathrm{younger}} - p_{\mathrm{older}} > 0$
Alpha level: 0.05
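A sketch of the test statistic these hypotheses lead to, assuming the standard pooled two-proportion form. The notation below is introduced here for illustration and is not taken from the notebook: $x_{\mathrm{younger}}$ and $x_{\mathrm{older}}$ are the weekend trip counts in each group, $n_{\mathrm{younger}}$ and $n_{\mathrm{older}}$ are the total trip counts, and $\hat{p}_{\mathrm{younger}} = x_{\mathrm{younger}}/n_{\mathrm{younger}}$, $\hat{p}_{\mathrm{older}} = x_{\mathrm{older}}/n_{\mathrm{older}}$ are the sample proportions.
\[
z = \frac{\hat{p}_{\mathrm{younger}} - \hat{p}_{\mathrm{older}}}{\sqrt{\hat{p}\,(1-\hat{p})\left(\frac{1}{n_{\mathrm{younger}}} + \frac{1}{n_{\mathrm{older}}}\right)}},
\qquad
\hat{p} = \frac{x_{\mathrm{younger}} + x_{\mathrm{older}}}{n_{\mathrm{younger}} + n_{\mathrm{older}}}
\]
Under this form, and because the alternative is one-sided, the null hypothesis would be rejected at the 0.05 level only when $z$ exceeds roughly 1.645.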
I use a z-test to compare the proportions for riders born in or before
1990 and riders born after 1990. I coded riders born in or before 1990 as
1 and riders born after 1990 as 0, then summed the trips in each group
with the \texttt{groupby} function.~\selectlanguage{english}
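The procedure described above might look roughly like the following minimal sketch. It is not the notebook's code: the function name \texttt{weekend\_proportion\_ztest}, the sample \texttt{trips} list, and the use of a pooled standard error are all assumptions made for illustration, and plain Python stands in for the \texttt{groupby} step.

```python
from math import sqrt
from statistics import NormalDist


def weekend_proportion_ztest(weekend_young, total_young, weekend_old, total_old):
    """One-sided pooled two-proportion z-test for H_A: p_young > p_old."""
    p_young = weekend_young / total_young
    p_old = weekend_old / total_old
    # Pooled proportion, assuming equal proportions under the null hypothesis.
    p_pool = (weekend_young + weekend_old) / (total_young + total_old)
    se = sqrt(p_pool * (1 - p_pool) * (1 / total_young + 1 / total_old))
    z = (p_young - p_old) / se
    # Upper-tail p-value, because the alternative is "greater than".
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical trips as (birth_year, is_weekend) pairs, for illustration only.
trips = [(1995, True), (1992, False), (1985, True), (1978, False),
         (1993, False), (1980, True), (1970, False), (1998, True)]

# Flag riders born in or before 1990 as 1 and later as 0, then tally per group.
counts = {0: [0, 0], 1: [0, 0]}  # flag -> [weekend trips, total trips]
for birth_year, is_weekend in trips:
    flag = 1 if birth_year <= 1990 else 0
    counts[flag][0] += int(is_weekend)
    counts[flag][1] += 1

z, p_value = weekend_proportion_ztest(*counts[0], *counts[1])
reject_null = p_value < 0.05  # alpha level from the text
```

With real data, the two count pairs would come from the per-group sums of the weekend flag and the per-group trip totals; the null hypothesis is rejected only when \texttt{reject\_null} is true.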
\subsection*{Conclusions -}\label{conclusions--}
\par\null
According to the z statistic, we cannot reject the null hypothesis
that the proportion of weekend trips taken by riders born after 1990 is
the same as or less than the proportion of weekend trips taken by riders
born in or before 1990. In conclusion, riders born in or before 1990
account for a significant share of ridership. A weakness of the analysis
is that it does not compare the same data across different time periods
to validate the result. If Citi Bike wants a larger share of the market,
it should make an effort to promote Citi Bike use among people born
after 1990.\selectlanguage{english}
The analysis notebook is attached here:~
\url{https://github.com/ShellyYoon/PUI2018\_yy2908/blob/master/HW8\_yy2908/HW8\_Assignment2.ipynb}
