EBookClubs

Read Books & Download eBooks Full Online

Book Directory and Project Forecasts

Download or read book Directory and Project Forecasts written by Ronald J. Walton and published by . This book was released on 1968 with total page 100 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book The Great Lakes

    Book Details:
  • Author : United States. Congress. House. Committee on Foreign Affairs. Subcommittee on Inter-American Affairs
  • Publisher :
  • Release : 1973
  • ISBN :
  • Pages : 788 pages

Download or read book The Great Lakes written by United States. Congress. House. Committee on Foreign Affairs. Subcommittee on Inter-American Affairs and published by . This book was released on 1973 with total page 788 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book The Great Lakes

    Book Details:
  • Author : United States. Congress. House Foreign Affairs Committee
  • Publisher :
  • Release : 1973
  • ISBN :
  • Pages : 766 pages

Download or read book The Great Lakes written by United States. Congress. House Foreign Affairs Committee and published by . This book was released on 1973 with total page 766 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Directory and Project Forecasts

Download or read book Directory and Project Forecasts written by Canadian Committee on Oceanography and published by . This book was released on 1969 with total page 416 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Catalog of the United States Geological Survey

Download or read book Catalog of the United States Geological Survey written by U.S. Geological Survey Library and published by MacMillan Publishing Company. This book was released on 1976 with total page 774 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Directory and Project Forecasts  1969 1970  the Great Lakes and Other Large Lakes

Download or read book Directory and Project Forecasts 1969 1970 the Great Lakes and Other Large Lakes written by Canadian Committee on Oceanography and published by . This book was released on 1970 with total page pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Selected Water Resources Abstracts

Download or read book Selected Water Resources Abstracts written by and published by . This book was released on 1974 with total page 1204 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book EIA Publications Directory

Download or read book EIA Publications Directory written by and published by . This book was released on 1981 with total page 218 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Book Catalog of the Library and Information Services Division  Shelf List catalog

Download or read book Book Catalog of the Library and Information Services Division Shelf List catalog written by Environmental Science Information Center. Library and Information Services Division and published by . This book was released on 1977 with total page 578 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book FOUR PROJECTS  PREDICTION AND FORECASTING USING MACHINE LEARNING WITH PYTHON

Download or read book FOUR PROJECTS PREDICTION AND FORECASTING USING MACHINE LEARNING WITH PYTHON written by Vivian Siahaan and published by BALIGE PUBLISHING. This book was released on 2022-05-25 with total page 612 pages. Available in PDF, EPUB and Kindle. Book excerpt: PROJECT 1: GOLD PRICE ANALYSIS AND FORECASTING USING MACHINE LEARNING WITH PYTHON The challenge of this project is to accurately predict the future adjusted closing price of a Gold ETF over a given period of time. This is a regression problem, because the output value, the adjusted closing price, is continuous. Data for this study was collected from November 18th 2011 to January 1st 2019 from various sources. The dataset has 1718 rows and 80 columns in total. Data were gathered for attributes such as Oil Price, Standard and Poor’s (S&P) 500 index, Dow Jones Index, US Bond rates (10 years), Euro USD exchange rates, prices of the precious metals Silver and Platinum and of other metals such as Palladium and Rhodium, the US Dollar Index, Eldorado Gold Corporation, and Gold Miners ETF. To forecast the adjusted closing price of gold by regression, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression. 
The machine learning models used to predict gold daily returns as the target variable are K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, MLP classifier, and Extra Trees classifier. Finally, you will plot decision boundaries, distribution of features, feature importance, predicted values versus true values, confusion matrix, learning curve, performance of the model, and scalability of the model. PROJECT 2: WIND POWER ANALYSIS AND FORECASTING USING MACHINE LEARNING WITH PYTHON Renewable energy remains one of the most important topics for a sustainable future. Wind, being a perennial source of power, could be utilized to satisfy our power requirements. With the rise of wind farms, wind power forecasting would prove to be quite useful. The dataset contains various weather, turbine and rotor features. Data has been recorded from January 2018 till March 2020. Readings have been recorded at a 10-minute interval. A long-term wind forecasting technique is thus required. The attributes in the dataset are as follows: ActivePower, AmbientTemperature, BearingShaftTemperature, Blade1PitchAngle, Blade2PitchAngle, Blade3PitchAngle, ControlBoxTemperature, GearboxBearingTemperature, GearboxOilTemperature, GeneratorRP, GeneratorWinding1Temperature, GeneratorWinding2Temperature, HubTemperature, MainBoxTemperature, NacellePosition, ReactivePower, RotorRPM, TurbineStatus, WTG, WindDirection, and WindSpeed. To forecast active power by regression, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression. 
To perform clustering, you will use the K-Means algorithm. The machine learning models used to predict categorized active power as the target variable are K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, and MLP classifier. Finally, you will plot decision boundaries, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 3: MACHINE LEARNING FOR CONCRETE COMPRESSIVE STRENGTH ANALYSIS AND PREDICTION WITH PYTHON Concrete is the most important material in civil engineering. The concrete compressive strength is a highly nonlinear function of age and ingredients. These ingredients include cement, blast furnace slag, fly ash, water, superplasticizer, coarse aggregate, and fine aggregate. The actual concrete compressive strength (MPa) for a given mixture at a specific age (days) was determined in the laboratory. This dataset is in raw form (not scaled). There are 1030 observations and 9 attributes in the dataset: 8 quantitative input variables and 1 quantitative output variable. The attributes in the dataset are as follows: Cement (component 1); Blast Furnace Slag (component 2); Fly Ash (component 3); Water (component 4); Superplasticizer (component 5); Coarse Aggregate (component 6); Fine Aggregate (component 7); Age; and Concrete compressive strength. 
To perform regression on concrete compressive strength, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression. To perform clustering, you will use the K-Means algorithm. The machine learning models used to predict clusters as the target variable are K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, and MLP classifier. Finally, you will plot decision boundaries, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 4: DATA SCIENCE FOR SALES ANALYSIS, FORECASTING, CLUSTERING, AND PREDICTION WITH PYTHON The dataset used in this project is from Walmart, a renowned retail corporation that operates a chain of hypermarkets. Walmart has provided data combining 45 stores, including store information and monthly sales. The data is provided on a weekly basis. Walmart tries to find the impact of holidays on the sales of its stores, for which it has included four holiday weeks in the dataset: Christmas, Thanksgiving, Super Bowl, and Labor Day. In this project, you are going to analyze and forecast weekly sales, perform clustering, and predict the resulting clusters. The dataset covers sales from 2010-02-05 to 2012-11-01. 
Following are the attributes in the dataset: Store - the store number; Date - the week of sales; Weekly_Sales - sales for the given store; Holiday_Flag - whether the week is a special holiday week (1 – Holiday week, 0 – Non-holiday week); Temperature - Temperature on the day of sale; Fuel_Price - Cost of fuel in the region; CPI – Prevailing consumer price index; and Unemployment - Prevailing unemployment rate. To perform regression on weekly sales, you will use: Linear Regression, Random Forest regression, Decision Tree regression, Support Vector Machine regression, Naïve Bayes regression, K-Nearest Neighbor regression, Adaboost regression, Gradient Boosting regression, Extreme Gradient Boosting regression, Light Gradient Boosting regression, Catboost regression, and MLP regression. To perform clustering, you will use the K-Means algorithm. The machine learning models used to predict clusters as the target variable are K-Nearest Neighbor classifier, Random Forest classifier, Naive Bayes classifier, Logistic Regression classifier, Decision Tree classifier, Support Vector Machine classifier, LGBM classifier, Gradient Boosting classifier, XGB classifier, and MLP classifier. Finally, you will plot decision boundaries, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.
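
The regression step named in the excerpt above (fitting a model to weekly sales and checking it on later weeks) can be sketched in a few lines. This is only an illustration and not code from the book: it uses the standard library rather than the book's libraries, fits a one-feature least-squares line, and runs on made-up numbers. The names `fit_line`, `rmse`, `weeks`, and `sales` are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy series (not the Walmart data): week index and weekly sales.
weeks = list(range(10))
sales = [100.0 + 5.0 * w for w in weeks]

# Chronological split: fit on the first 8 weeks, evaluate on the last 2,
# so the model is never tested on weeks that precede its training data.
split = 8
slope, intercept = fit_line(weeks[:split], sales[:split])
predictions = [slope * w + intercept for w in weeks[split:]]
error = rmse(sales[split:], predictions)
```

Splitting by time rather than at random is an assumption made here because the data are weekly; the excerpt does not say how the book splits its data.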

Book Catalog of the United States Geological Survey Library

Download or read book Catalog of the United States Geological Survey Library written by U.S. Geological Survey Library and published by . This book was released on 1976 with total page 778 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book The Great Lakes  Hearings  The 1973 floods and activities of the International Joint Commission  United States and Canada

Download or read book The Great Lakes Hearings The 1973 floods and activities of the International Joint Commission United States and Canada written by United States. Congress. House. Committee on Foreign Affairs. Subcommittee on Inter-American Affairs and published by . This book was released on 1973 with total page 884 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Great Lakes Basin Library  Title arrangement

Download or read book Great Lakes Basin Library Title arrangement written by United States. Great Lakes Basin Commission. Library and published by . This book was released on 1969 with total page 458 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book Book Catalog of the Library and Information Services Division  Author title series indexes

Download or read book Book Catalog of the Library and Information Services Division Author title series indexes written by Environmental Science Information Center. Library and Information Services Division and published by . This book was released on 1977 with total page 512 pages. Available in PDF, EPUB and Kindle. Book excerpt:

Book REGRESSION  SEGMENTATION  CLUSTERING  AND PREDICTION PROJECTS WITH PYTHON

Download or read book REGRESSION SEGMENTATION CLUSTERING AND PREDICTION PROJECTS WITH PYTHON written by Vivian Siahaan and published by BALIGE PUBLISHING. This book was released on 2022-02-25 with total page 623 pages. Available in PDF, EPUB and Kindle. Book excerpt: PROJECT 1: TIME-SERIES WEATHER: FORECASTING AND PREDICTION WITH PYTHON Weather data are described and quantified by the variables of Earth's atmosphere: temperature, air pressure, humidity, and the variations and interactions of these variables, and how they change over time. Different spatial scales are used to describe and predict weather on local, regional, and global levels. The dataset used in this project contains weather data for New Delhi, India. This data was taken out from wunderground. It contains various features such as temperature, pressure, humidity, rain, precipitation, etc. The main target is to develop a prediction model accurate enough for forecasting temperature and predicting target variable (condition). Time-series weather forecasting will be done using ARIMA models. The machine learning models used in this project to predict target variable (condition) are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM classifier, Gradient Boosting, XGB classifier, and MLP classifier. Finally, you will plot boundary decision, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 2: HOUSE PRICE: ANALYSIS AND PREDICTION USING MACHINE LEARNING WITH PYTHON The dataset used in this project is taken from the second chapter of Aurélien Géron's recent book 'Hands-On Machine learning with Scikit-Learn and TensorFlow'. 
It serves as an excellent introduction to implementing machine learning algorithms because it requires rudimentary data cleaning, has an easily understandable list of variables and sits at an optimal size between being too toyish and too cumbersome. The data contains information from the 1990 California census. Although it may not help you with predicting current housing prices like the Zillow Zestimate dataset, it does provide an accessible introductory dataset for teaching people about the basics of machine learning. The data pertains to the houses found in a given California district and some summary stats about them based on the 1990 census data. Be warned the data aren't cleaned so there are some preprocessing steps required! The columns are as follows: longitude, latitude, housing_median_age, total_rooms, total_bedrooms, population, households, median_income, median_house_value, and ocean_proximity. The machine learning models used in this project to perform regression on median_house_value and to predict it as the target variable are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM classifier, Gradient Boosting, XGB classifier, and MLP classifier. Finally, you will plot decision boundaries, distribution of features, feature importance, cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 3: CUSTOMER PERSONALITY ANALYSIS AND PREDICTION USING MACHINE LEARNING WITH PYTHON Customer Personality Analysis is a detailed analysis of a company’s ideal customers. It helps a business to better understand its customers and makes it easier for them to modify products according to the specific needs, behaviors and concerns of different types of customers. 
Customer personality analysis helps a business to modify its product based on its target customers from different types of customer segments. For example, instead of spending money to market a new product to every customer in the company’s database, a company can analyze which customer segment is most likely to buy the product and then market the product only on that particular segment. Following are the features in the dataset: ID = Customer's unique identifier; Year_Birth = Customer's birth year; Education = Customer's education level; Marital_Status = Customer's marital status; Income = Customer's yearly household income; Kidhome = Number of children in customer's household; Teenhome = Number of teenagers in customer's household; Dt_Customer = Date of customer's enrollment with the company; Recency = Number of days since customer's last purchase; MntWines = Amount spent on wine in the last 2 years; MntFruits = Amount spent on fruits in the last 2 years; MntMeatProducts = Amount spent on meat in the last 2 years; MntFishProducts = Amount spent on fish in the last 2 years; MntSweetProducts = Amount spent on sweets in the last 2 years; MntGoldProds = Amount spent on gold in the last 2 years; NumDealsPurchases = Number of purchases made with a discount; NumWebPurchases = Number of purchases made through the company's web site; NumCatalogPurchases = Number of purchases made using a catalogue; NumStorePurchases = Number of purchases made directly in stores; NumWebVisitsMonth = Number of visits to company's web site in the last month; AcceptedCmp3 = 1 if customer accepted the offer in the 3rd campaign, 0 otherwise; AcceptedCmp4 = 1 if customer accepted the offer in the 4th campaign, 0 otherwise; AcceptedCmp5 = 1 if customer accepted the offer in the 5th campaign, 0 otherwise; AcceptedCmp1 = 1 if customer accepted the offer in the 1st campaign, 0 otherwise; AcceptedCmp2 = 1 if customer accepted the offer in the 2nd campaign, 0 otherwise; Response = 1 if customer 
accepted the offer in the last campaign, 0 otherwise; and Complain = 1 if customer complained in the last 2 years, 0 otherwise. The target in this project is to perform clustering and predicting to summarize customer segments. In this project, you will perform clustering using KMeans to get 4 clusters. The machine learning models used in this project to perform regression on total number of purchase and to predict clusters as target variable are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM, Gradient Boosting, XGB, and MLP. Finally, you will plot boundary decision, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 4: CUSTOMER SEGMENTATION, CLUSTERING, AND PREDICTION WITH PYTHON In this project, you will develop a customer segmentation, clustering, and prediction to define marketing strategy. The sample dataset summarizes the usage behavior of about 9000 active credit card holders during the last 6 months. The file is at a customer level with 18 behavioral variables. 
Following is the Data Dictionary for Credit Card dataset: CUSTID: Identification of Credit Card holder (Categorical); BALANCE: Balance amount left in their account to make purchases; BALANCEFREQUENCY: How frequently the Balance is updated, score between 0 and 1 (1 = frequently updated, 0 = not frequently updated); PURCHASES: Amount of purchases made from account; ONEOFFPURCHASES: Maximum purchase amount done in one-go; INSTALLMENTSPURCHASES: Amount of purchase done in installment; CASHADVANCE: Cash in advance given by the user; PURCHASESFREQUENCY: How frequently the Purchases are being made, score between 0 and 1 (1 = frequently purchased, 0 = not frequently purchased); ONEOFFPURCHASESFREQUENCY: How frequently Purchases are happening in one-go (1 = frequently purchased, 0 = not frequently purchased); PURCHASESINSTALLMENTSFREQUENCY: How frequently purchases in installments are being done (1 = frequently done, 0 = not frequently done); CASHADVANCEFREQUENCY: How frequently the cash in advance being paid; CASHADVANCETRX: Number of Transactions made with "Cash in Advanced"; PURCHASESTRX: Number of purchase transactions made; CREDITLIMIT: Limit of Credit Card for user; PAYMENTS: Amount of Payment done by user; MINIMUM_PAYMENTS: Minimum amount of payments made by user; PRCFULLPAYMENT: Percent of full payment paid by user; and TENURE: Tenure of credit card service for user. In this project, you will perform clustering using KMeans to get 5 clusters. The machine learning models used in this project to perform regression on total number of purchase and to predict clusters as target variable are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM, Gradient Boosting, XGB, and MLP. 
Finally, you will plot boundary decision, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.
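
Several projects in the excerpt above follow the same two-stage pattern: K-Means assigns each row to a cluster, and a classifier is then trained to predict that cluster label. A minimal sketch of the pattern follows, assuming hand-rolled standard-library versions of K-Means and K-Nearest Neighbor rather than the library implementations the book presumably uses. The toy points and the names `kmeans` and `knn_predict` are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm starting from the given initial centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids, [nearest(p, centroids) for p in points]


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: squared_distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy customers described by two behavioral features.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

# Stage 1: cluster. Fixed initial centroids keep the run deterministic.
centroids, labels = kmeans(points, centroids=[points[0], points[3]])

# Stage 2: treat the cluster labels as the target and classify unseen points.
low_spender = knn_predict(points, labels, (0, 0))
high_spender = knn_predict(points, labels, (10, 10))
```

The sketch uses two clusters to stay small, whereas the excerpt asks for 4 and 5 clusters; only the number of initial centroids would change.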

Book THREE PROJECTS  Sentiment Analysis and Prediction Using Machine Learning and Deep Learning with Python GUI

Download or read book THREE PROJECTS Sentiment Analysis and Prediction Using Machine Learning and Deep Learning with Python GUI written by Vivian Siahaan and published by BALIGE PUBLISHING. This book was released on 2022-03-21 with total page 620 pages. Available in PDF, EPUB and Kindle. Book excerpt: PROJECT 1: TEXT PROCESSING AND SENTIMENT ANALYSIS USING MACHINE LEARNING AND DEEP LEARNING WITH PYTHON GUI Twitter data used in this project was scraped in February of 2015 and contributors were asked to first classify positive, negative, and neutral tweets, followed by categorizing negative reasons (such as "late flight" or "rude service"). This data was originally posted by Crowdflower last February and includes tweets about 6 major US airlines. Additionally, Crowdflower had their workers extract the sentiment from the tweet as well as what the passenger was disappointed about if the tweet was negative. The main attributes for this project are as follows: airline_sentiment : Sentiment classification (positive, neutral, and negative); negativereason : Reason selected for the negative opinion; airline : Name of 6 US Airlines ('Delta', 'United', 'Southwest', 'US Airways', 'Virgin America', 'American'); and text : Customer's opinion. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, and LSTM. Three vectorizers used in machine learning are Hashing Vectorizer, Count Vectorizer, and TF-IDF Vectorizer. Finally, you will develop a GUI using PyQt5 to plot cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. 
PROJECT 2: HOTEL REVIEW: SENTIMENT ANALYSIS USING MACHINE LEARNING AND DEEP LEARNING WITH PYTHON GUI The data used in this project is the data published by Anurag Sharma about hotel reviews that were given by customers. The data is given in two files, a train and a test file. The train.csv is the training data, containing a unique User_ID for each entry with the review entered by a customer and the browser and device used. The target variable is Is_Response, a variable that states whether the customer was happy or not happy while staying in the hotel. This type of variable makes the project a classification problem. The test.csv is the testing data and contains similar headings as the train data, without the target variable. The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, and LSTM. Three vectorizers used in machine learning are Hashing Vectorizer, Count Vectorizer, and TF-IDF Vectorizer. Finally, you will develop a GUI using PyQt5 to plot cross validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 3: STUDENT ACADEMIC PERFORMANCE ANALYSIS AND PREDICTION USING MACHINE LEARNING WITH PYTHON GUI The dataset used in this project consists of student achievement in secondary education of two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and they were collected by using school reports and questionnaires. Two datasets are provided regarding the performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). The two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. 
This occurs because G3 is the final year grade (issued at the 3rd period), while G1 and G2 correspond to the 1st and 2nd period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful. Attributes in the dataset are as follows: school - student's school (binary: 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira); sex - student's sex (binary: 'F' - female or 'M' - male); age - student's age (numeric: from 15 to 22); address - student's home address type (binary: 'U' - urban or 'R' - rural); famsize - family size (binary: 'LE3' - less or equal to 3 or 'GT3' - greater than 3); Pstatus - parent's cohabitation status (binary: 'T' - living together or 'A' - apart); Medu - mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education); Fedu - father's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education); Mjob - mother's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other'); Fjob - father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other'); reason - reason to choose this school (nominal: close to 'home', school 'reputation', 'course' preference or 'other'); guardian - student's guardian (nominal: 'mother', 'father' or 'other'); traveltime - home to school travel time (numeric: 1 - <15 min., 2 - 15 to 30 min., 3 - 30 min. 
to 1 hour, or 4 - >1 hour); studytime - weekly study time (numeric: 1 - <2 hours, 2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours); failures - number of past class failures (numeric: n if 1<=n<3, else 4); schoolsup - extra educational support (binary: yes or no); famsup - family educational support (binary: yes or no); paid - extra paid classes within the course subject (Math or Portuguese) (binary: yes or no); activities - extra-curricular activities (binary: yes or no); nursery - attended nursery school (binary: yes or no); higher - wants to take higher education (binary: yes or no); internet - Internet access at home (binary: yes or no); romantic - with a romantic relationship (binary: yes or no); famrel - quality of family relationships (numeric: from 1 - very bad to 5 - excellent); freetime - free time after school (numeric: from 1 - very low to 5 - very high); goout - going out with friends (numeric: from 1 - very low to 5 - very high); Dalc - workday alcohol consumption (numeric: from 1 - very low to 5 - very high); Walc - weekend alcohol consumption (numeric: from 1 - very low to 5 - very high); health - current health status (numeric: from 1 - very bad to 5 - very good); absences - number of school absences (numeric: from 0 to 93); G1 - first period grade (numeric: from 0 to 20); G2 - second period grade (numeric: from 0 to 20); and G3 - final grade (numeric: from 0 to 20, output target). The models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, and XGB classifier. Three feature scaling used in machine learning are raw, minmax scaler, and standard scaler. Finally, you will develop a GUI using PyQt5 to plot cross validation score, predicted values versus true values, confusion matrix, learning curve, decision boundaries, performance of the model, scalability of the model, training loss, and training accuracy.
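
The last project in the excerpt above compares three feature-scaling options: raw values, a minmax scaler, and a standard scaler. As a rough illustration of what the two scalers compute, here is a standard-library sketch. The function names and the sample grades are hypothetical, and the book itself presumably relies on library scalers rather than code like this.

```python
import statistics


def minmax_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Shift to zero mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical final grades on the 0 to 20 scale described for G3.
grades = [0, 5, 10, 15, 20]

raw = list(grades)                  # option 1: leave the feature untouched
scaled_01 = minmax_scale(grades)    # option 2: minmax scaling
scaled_z = standard_scale(grades)   # option 3: standard scaling
```

Distance-based models such as K-Nearest Neighbor and Support Vector Machine are usually the most sensitive to this choice, while tree-based models such as Random Forest are largely unaffected by it.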

Book ANALYSIS AND PREDICTION PROJECTS USING MACHINE LEARNING AND DEEP LEARNING WITH PYTHON

Download or read book ANALYSIS AND PREDICTION PROJECTS USING MACHINE LEARNING AND DEEP LEARNING WITH PYTHON written by Vivian Siahaan and published by BALIGE PUBLISHING. This book was released on 2022-02-17 with total page 860 pages. Available in PDF, EPUB and Kindle. Book excerpt: PROJECT 1: DEFAULT LOAN PREDICTION BASED ON CUSTOMER BEHAVIOR Using Machine Learning and Deep Learning with Python In finance, default is failure to meet the legal obligations (or conditions) of a loan, for example when a home buyer fails to make a mortgage payment, or when a corporation or government fails to pay a bond which has reached maturity. A national or sovereign default is the failure or refusal of a government to repay its national debt. The dataset used in this project belongs to a Hackathon organized by "Univ.AI". All values were provided at the time of the loan application. Following are the features in the dataset: Income, Age, Experience, Married/Single, House_Ownership, Car_Ownership, Profession, CITY, STATE, CURRENT_JOB_YRS, CURRENT_HOUSE_YRS, and Risk_Flag. The Risk_Flag indicates whether there has been a default in the past or not. The machine learning models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, Adaboost, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will plot boundary decision, ROC, distribution of features, feature importance, cross validation score, and predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 2: AIRLINE PASSENGER SATISFACTION Analysis and Prediction Using Machine Learning and Deep Learning with Python The dataset used in this project contains an airline passenger satisfaction survey. 
In this case, you will determine which factors are highly correlated with a satisfied (or dissatisfied) passenger and predict passenger satisfaction. The features in the dataset are: Gender: Gender of the passengers (Female, Male); Customer Type: The customer type (Loyal customer, disloyal customer); Age: The actual age of the passengers; Type of Travel: Purpose of the passengers' flight (Personal Travel, Business Travel); Class: Travel class of the passengers (Business, Eco, Eco Plus); Flight distance: The flight distance of this journey; Inflight wifi service: Satisfaction level of the inflight wifi service (0: Not Applicable; 1-5); Departure/Arrival time convenient: Satisfaction level of departure/arrival time convenience; Ease of Online booking: Satisfaction level of online booking; Gate location: Satisfaction level of gate location; Food and drink: Satisfaction level of food and drink; Online boarding: Satisfaction level of online boarding; Seat comfort: Satisfaction level of seat comfort; Inflight entertainment: Satisfaction level of inflight entertainment; On-board service: Satisfaction level of on-board service; Leg room service: Satisfaction level of leg room service; Baggage handling: Satisfaction level of baggage handling; Check-in service: Satisfaction level of check-in service; Inflight service: Satisfaction level of inflight service; Cleanliness: Satisfaction level of cleanliness; Departure Delay in Minutes: Minutes of delay at departure; Arrival Delay in Minutes: Minutes of delay at arrival; and Satisfaction: Airline satisfaction level (Satisfaction, neutral or dissatisfaction). The machine learning models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D.
Finally, you will plot decision boundaries, ROC, distribution of features, feature importance, cross-validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 3: CREDIT CARD CHURNING CUSTOMER ANALYSIS AND PREDICTION USING MACHINE LEARNING AND DEEP LEARNING WITH PYTHON The dataset used in this project covers more than 10,000 customers, with their age, salary, marital_status, credit card limit, credit card category, etc. There are 20 features in the dataset. Only 16.07% of the customers in the dataset have churned, so the classes are imbalanced and it is harder to train a model to predict churning customers. The features in the dataset are: 'Attrition_Flag', 'Customer_Age', 'Gender', 'Dependent_count', 'Education_Level', 'Marital_Status', 'Income_Category', 'Card_Category', 'Months_on_book', 'Total_Relationship_Count', 'Months_Inactive_12_mon', 'Contacts_Count_12_mon', 'Credit_Limit', 'Total_Revolving_Bal', 'Avg_Open_To_Buy', 'Total_Amt_Chng_Q4_Q1', 'Total_Trans_Amt', 'Total_Trans_Ct', 'Total_Ct_Chng_Q4_Q1', and 'Avg_Utilization_Ratio'. The target variable is 'Attrition_Flag'. The machine learning models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will plot decision boundaries, ROC, distribution of features, feature importance, cross-validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 4: MARKETING ANALYSIS AND PREDICTION USING MACHINE LEARNING AND DEEP LEARNING WITH PYTHON This data set was provided to students for their final project in order to test their statistical analysis skills as part of an MSc in Business Analytics. It can be utilized for EDA, Statistical Analysis, and Visualizations. The features in the dataset are: ID = Customer's unique identifier; Year_Birth = Customer's birth year; Education = Customer's education level; Marital_Status = Customer's marital status; Income = Customer's yearly household income; Kidhome = Number of children in customer's household; Teenhome = Number of teenagers in customer's household; Dt_Customer = Date of customer's enrollment with the company; Recency = Number of days since customer's last purchase; MntWines = Amount spent on wine in the last 2 years; MntFruits = Amount spent on fruits in the last 2 years; MntMeatProducts = Amount spent on meat in the last 2 years; MntFishProducts = Amount spent on fish in the last 2 years; MntSweetProducts = Amount spent on sweets in the last 2 years; MntGoldProds = Amount spent on gold in the last 2 years; NumDealsPurchases = Number of purchases made with a discount; NumWebPurchases = Number of purchases made through the company's web site; NumCatalogPurchases = Number of purchases made using a catalogue; NumStorePurchases = Number of purchases made directly in stores; NumWebVisitsMonth = Number of visits to company's web site in the last month; AcceptedCmp3 = 1 if customer accepted the offer in the 3rd campaign, 0 otherwise; AcceptedCmp4 = 1 if customer accepted the offer in the 4th campaign, 0 otherwise; AcceptedCmp5 = 1 if customer accepted the offer in the 5th campaign, 0 otherwise; AcceptedCmp1 = 1 if customer accepted the offer in the 1st campaign, 0 otherwise; AcceptedCmp2 = 1 if customer accepted the offer in the 2nd campaign, 0 otherwise; Response = 1 if customer accepted the offer in the last campaign, 0 otherwise; Complain = 1 if customer complained in the last 2 years, 0 otherwise; and Country = Customer's location.
The machine and deep learning models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM classifier, Gradient Boosting, XGB classifier, MLP classifier, and CNN 1D. Finally, you will plot decision boundaries, ROC, distribution of features, feature importance, cross-validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy. PROJECT 5: METEOROLOGICAL DATA ANALYSIS AND PREDICTION USING MACHINE LEARNING WITH PYTHON Meteorological phenomena are described and quantified by the variables of Earth's atmosphere: temperature, air pressure, water vapour, mass flow, and the variations and interactions of these variables, and how they change over time. Different spatial scales are used to describe and predict weather on local, regional, and global levels. The dataset used in this project consists of meteorological data with 96,453 data points and 11 attributes (columns). The columns in the dataset are: Formatted Date; Summary; Precip Type; Temperature (C); Apparent Temperature (C); Humidity; Wind Speed (km/h); Wind Bearing (degrees); Visibility (km); Pressure (millibars); and Daily Summary. The machine learning models used in this project are K-Nearest Neighbor, Random Forest, Naive Bayes, Logistic Regression, Decision Tree, Support Vector Machine, LGBM classifier, Gradient Boosting, XGB classifier, and MLP classifier. Finally, you will plot decision boundaries, distribution of features, feature importance, cross-validation score, predicted values versus true values, confusion matrix, learning curve, performance of the model, scalability of the model, training loss, and training accuracy.
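The confusion matrix mentioned among the plots is a table of true labels against predicted labels. The sketch below is only an illustration, not the book's code: the `rain`/`snow` labels are an assumed encoding of the Precip Type column, the true and predicted lists are made up, and the helper names are hypothetical. Rows are true labels and columns are predicted labels, the same layout scikit-learn's `confusion_matrix` uses.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a matrix where entry [i][j] counts samples whose true label
    is labels[i] and whose predicted label is labels[j]."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels and predictions, for illustration only.
labels = ["rain", "snow"]
y_true = ["rain", "rain", "snow", "rain", "snow", "snow", "rain", "rain"]
y_pred = ["rain", "snow", "snow", "rain", "rain", "snow", "rain", "rain"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(f"true={label}: {row}")
print("accuracy:", accuracy(matrix))
```

Off-diagonal entries show which classes a model confuses with each other, which a single accuracy figure hides.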