Purpose Statement and Model
This research examines the relationship between the salary of a Major League Baseball (MLB) player and their statistically measured performance on the field. Salary is the dependent variable. It is explained by independent variables derived from a player’s statistical record: on-base plus slugging (OPS), experience in years (EXP), batting average (BA), and stolen bases (SB). Of these, OPS is expected to show the strongest correlation with salary, because it combines two measures, on-base performance and slugging performance, into a single number that evaluates a player (About OPS, n.d.). It therefore captures overall performance across several skills, which makes it an effective basis for explaining salary.
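The text does not state the functional form of the model. Assuming the linear specification implied by the later regression analysis, the relationship between salary and the four independent variables could be written as follows, where the coefficients \beta_0 through \beta_4 and the error term \varepsilon_i are notation introduced here for illustration:

```latex
\[
\text{Salary}_i = \beta_0 + \beta_1\,\text{OPS}_i + \beta_2\,\text{EXP}_i
                + \beta_3\,\text{BA}_i + \beta_4\,\text{SB}_i + \varepsilon_i
\]
```

Under this assumed form, a positive estimate of \beta_1 would indicate that players with a higher OPS tend to earn higher salaries, holding the other variables constant.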
Definition of Variables
To evaluate a player’s annual pay, the player’s annual salary figure must first be obtained. Past research has found a strong correlation between a player’s performance and the length of the player’s contract. OPS is the preferred performance statistic mainly because most past research on the performance and productivity of MLB players has relied on it (Houser, 2005). Using the same statistic means that the data obtained on a player’s productivity and performance can be compared with past results. The same measure was also used to determine the effect of contract length on player productivity (Stankiewicz, 2009).
Batting average (BA), an independent variable, measures a player’s batting ability. It is calculated by dividing the number of hits the player has had by the number of at bats (About Battling Average, n.d.). Because it is a measure of performance, a player’s batting average is expected to relate directly to annual pay.
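The two rate statistics described so far can be sketched in a few lines of Python. The batting average follows the hits-per-at-bat definition above. The OPS components use the standard on-base and slugging formulas, which the text itself does not spell out. The function names and the season line in the example are hypothetical.

```python
def batting_average(hits, at_bats):
    """Hits divided by at bats."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """Times on base divided by the plate appearances that count toward OBP."""
    denominator = at_bats + walks + hit_by_pitch + sacrifice_flies
    return (hits + walks + hit_by_pitch) / denominator if denominator else 0.0


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


def ops(obp, slg):
    """On-base plus slugging: the sum of the two component rates."""
    return obp + slg


# Hypothetical season line, invented for illustration (not a real player).
hits, at_bats, walks, hit_by_pitch, sacrifice_flies = 150, 500, 50, 5, 5
doubles, triples, home_runs = 30, 2, 20
singles = hits - doubles - triples - home_runs

ba = batting_average(hits, at_bats)
obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies)
slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats)
print(f"BA={ba:.3f} OBP={obp:.3f} SLG={slg:.3f} OPS={ops(obp, slg):.3f}")
```

Because OPS adds an on-base rate to a power rate, it rewards walks and extra-base hits that batting average alone ignores.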
Stolen bases (SB), also an independent variable, counts the number of successful advances a base runner has made to the next base by stealing it. This variable is also expected to have a positive relationship with a player’s annual pay.
Experience (EXP) is the number of years the player has played at the professional level in Major League Baseball. The most common observation is a positive correlation between experience and annual pay: more experienced players tend to be paid more. A more accurate explanation, however, is that experience improves a player’s performance, which in turn raises annual pay. Experience is therefore treated as a secondary independent variable, because a salary that reflects experience can still be adversely affected by the player’s performance at that time (Papps, 2010).
Data Description
Data on 20 current MLB players are obtained from the Baseball Reference website, which provides details on the variables used in the research. The data collected from the database are BA, SB, annual salary, and OPS for 2016, along with years of experience up to 2016. Random sampling will be used to select the players. Values for the five variables are then taken from each player’s 2016 salary and career summary. BA, SB, salary information, and OPS+ are the values being analyzed, so only these are collected from the database. Service time provides the years of experience of each player.
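The random selection of 20 players could be done along the following lines. This is a minimal sketch: the population list is a hypothetical stand-in for the roster of current players (the text does not say how that list was assembled), and the seed is there only so the draw can be reproduced.

```python
import random


def sample_players(player_names, sample_size=20, seed=None):
    """Draw a simple random sample of players without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(player_names), sample_size)


# Hypothetical population: placeholder names, not real players.
population = [f"Player {i}" for i in range(1, 751)]

sample = sample_players(population, sample_size=20, seed=2016)
for name in sample:
    print(name)
```

Sampling without replacement guarantees that no player appears twice among the 20 observations.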
Analysis of the Results
The fitted salary values, compared with the residuals, are used to assess the linearity of the regression model. The regression analysis produced both negative and positive residuals, and these are randomly distributed, so no specific trend can be identified in them. Because the residuals show no systematic pattern, the linearity assumption is supported rather than violated.
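A residual check of this kind can be sketched as follows. For brevity the example regresses salary on OPS alone rather than on all four variables, and it runs on synthetic data generated in the script, not on the study's sample. All names and numbers are illustrative assumptions.

```python
import random


def fit_simple_regression(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def residuals_by_fitted(x, y, intercept, slope):
    """Return (fitted, residual) pairs ordered by fitted value."""
    pairs = []
    for xi, yi in zip(x, y):
        fitted = intercept + slope * xi
        pairs.append((fitted, yi - fitted))
    return sorted(pairs)


# Synthetic data (assumption): 20 players, salary in millions of dollars.
rng = random.Random(0)
ops_values = [rng.uniform(0.600, 1.000) for _ in range(20)]
salaries = [-10.0 + 20.0 * v + rng.gauss(0.0, 2.0) for v in ops_values]

intercept, slope = fit_simple_regression(ops_values, salaries)
pairs = residuals_by_fitted(ops_values, salaries, intercept, slope)
residuals = [residual for _, residual in pairs]

n_positive = sum(r > 0 for r in residuals)
n_negative = sum(r < 0 for r in residuals)
# Sign changes along the fitted values: long runs of one sign would hint
# at curvature, while frequent changes are what random scatter looks like.
sign_changes = sum((a > 0) != (b > 0) for a, b in zip(residuals, residuals[1:]))
print(n_positive, n_negative, sign_changes)
```

Least-squares residuals from a model with an intercept always sum to zero, so a mix of positive and negative values is guaranteed. What matters for the linearity check is whether the signs are scattered along the fitted values or grouped into long runs.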