The author of Speedml explains data science solutions for beginners. Includes a chapter on a Top 20 most-voted Kaggle solution.
Speedml is a Python package for speed-starting machine learning projects.
The idea behind the package is to codify the data science journey of experts who create top-ranking ML solutions on competition sites like Kaggle into an easy, productive, and intuitive Python library for ML beginners.
Install the Speedml package with pip using the following command.
pip install speedml
The Speedml.com website documents API use cases, behind-the-scenes implementation, features, best practices, and demos in detail.
Speedml is open source and available under the MIT license. We manage the project on GitHub.
We are authoring the Speedml API with four goals in mind.
Speedml already imports and properly initializes the popular ML packages including pandas, numpy, sklearn, xgboost, and matplotlib. All you need to do is import speedml to get started.
from speedml import Speedml
Coding is up to 3X faster when using Speedml because of (1) iterative development, (2) linear workflow, and (3) component-based API.
These two lines of Speedml code (a) load the training and test datasets, (b) define the target and unique id features, and (c) plot the feature correlation matrix heatmap for numerical features.
sml = Speedml('train.csv', 'test.csv',
target='Survived', uid='PassengerId')
sml.plot.correlate()
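To make the idea of a feature correlation matrix concrete, here is a minimal plain-Python sketch that computes pairwise Pearson correlations between numerical columns. It is not Speedml's implementation; the `pearson` helper and the toy column values are made up for illustration, and a heatmap would simply color each cell of the resulting matrix.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy numerical features (illustrative values, not real Titanic rows).
columns = {
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 1, 3],
    "Fare": [7.25, 71.28, 53.10, 8.05],
}

# Correlation matrix as a nested dict: matrix[a][b] is corr(a, b).
matrix = {
    a: {b: pearson(columns[a], columns[b]) for b in columns}
    for a in columns
}

for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Values near 1 or -1 indicate strongly related features, and values near 0 indicate little linear relationship.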
A notebook using Speedml reduces the coding required by up to 70%. The Speedml API implements methods that take zero or a minimal number of parameters and work on sensible defaults.
A call to this single method replaces empty values in the entire dataframe with the median value for numerical features and the most common value for text features.
sml.feature.impute()
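The imputation rule described above can be sketched in plain Python. This is an illustration of the rule only, under the assumption that missing values are represented as `None` and that every column has at least one non-missing value; the `impute` function here is hypothetical and is not Speedml's code.

```python
from statistics import median, mode

def impute(rows):
    """Return new rows with None filled per column:
    median for numerical columns, most common value for text columns."""
    columns = list(rows[0].keys())
    fill = {}
    for col in columns:
        present = [r[col] for r in rows if r[col] is not None]
        if all(isinstance(v, (int, float)) for v in present):
            fill[col] = median(present)   # numerical feature
        else:
            fill[col] = mode(present)     # text feature
    return [
        {c: (fill[c] if r[c] is None else r[c]) for c in columns}
        for r in rows
    ]

# Toy rows with gaps (illustrative values only).
rows = [
    {"Age": 22.0, "Embarked": "S"},
    {"Age": None, "Embarked": "C"},
    {"Age": 38.0, "Embarked": None},
    {"Age": 26.0, "Embarked": "S"},
]

filled = impute(rows)
print(filled)
```

The median is used for numbers because it is less sensitive to outliers than the mean, which is a common default for this kind of quick imputation.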
Understanding machine learning fundamentals is a breeze with Speedml as we have designed the API to follow a linear workflow with sensible prerequisites and intuitive next steps.
These three lines of Speedml code perform feature engineering by replacing null values, extracting a new feature matching a regular expression, and dropping a feature that is no longer required.
sml.feature.fillna(a='Cabin', new='Z')
sml.feature.extract(new='Deck', a='Cabin', regex='([A-Z]){1}')
sml.feature.drop(['Cabin'])
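For readers who want to see what these three steps amount to, the following plain-Python sketch applies the same sequence to a few toy rows: fill missing `Cabin` values with `'Z'`, extract the first capital letter into `Deck` using the same regular expression, then drop `Cabin`. The row data is made up, and this is an assumption about the effect of the calls rather than Speedml's internals.

```python
import re

# Toy rows (illustrative values only); None marks a missing cabin.
rows = [
    {"PassengerId": 1, "Cabin": "C85"},
    {"PassengerId": 2, "Cabin": None},
    {"PassengerId": 3, "Cabin": "E46"},
]

# Same pattern as in the Speedml call: one capital letter, captured.
deck_pattern = re.compile(r"([A-Z]){1}")

for row in rows:
    # Step 1: replace a missing Cabin with the placeholder 'Z'.
    if row["Cabin"] is None:
        row["Cabin"] = "Z"
    # Step 2: extract the first matching letter into the new Deck feature.
    match = deck_pattern.search(row["Cabin"])
    row["Deck"] = match.group(1) if match else None
    # Step 3: drop the Cabin feature, which is no longer required.
    del row["Cabin"]

decks = [row["Deck"] for row in rows]
print(decks)
```

Filling before extracting matters here: the placeholder `'Z'` itself matches the pattern, so passengers with no recorded cabin end up in their own `Deck` category instead of staying empty.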
We hope you enjoy using Speedml in your projects. Watch this space, as we intend to update Speedml frequently with more features.