Concretely applied to sales data in Jupyter Notebook step by step (part 2.2 of 12)

(Photo by Tim van Cleef on Unsplash)

This post is not about whether you have Domingos’ excellent book about the master algorithm [1] on your shelf, nor about why China’s president Xi Jinping does.

Instead, this data driven dealings development (DDDD) series is aimed at people who want to learn the concepts of statistical analysis, machine learning (ML), deep learning (DL), artificial intelligence (AI), statistical process control (SPC), data mining and data science (DS) with sales data in practice. It’s meant as a truly exhaustive explanation starting from scratch with easy-to-adapt data. We’ll conquer the concepts of data science, explore our data, reflect on data…


What SAP tables to join to analyze efficiency

It’s all about efficiency (French rocks, image by author)

Motivation:

Even the best manufacturing processes vary in their efficiency. Wouldn’t it be interesting to analyze efficiency over time by extracting that data from your ERP into a data warehouse? The SQL below shows you how to get there, from an SAP BI point of view. Please note that there is no single 100% correct and complete solution for this request, since every SAP ERP system is customized to its company. But it should definitely point you in the right direction.

Solution:

If you are looking for efficiency, tables AFVV (operation quantity, value and date), AFVC (operation within an order) and CRHD (work center header)…


What SAP tables to join to analyze scrapping costs

Nobody’s perfect (Fondor, with permission from Matthias Böhler)

Motivation:

Even the best manufacturing processes will produce scrap, at least a little bit from time to time. Wouldn’t it be interesting to analyze these scrapping costs over time by extracting that data from your ERP into a data warehouse? The SQL below shows you how to get there, from an SAP BI point of view. Please note that there is no single 100% correct and complete solution for this request, since every SAP ERP system is customized to its company. But it should definitely point you in the right direction.

Solution:

If you are looking for scrap costs, QMFE is the central SAP table…


Concretely applied to sales data step by step (part 2.1 of 12)

You do not need much to get started on Jupyter Notebook (Norwegian woods, image by author)

This data driven dealings development (DDDD) series is aimed at people who want to learn the concepts of statistical analysis, machine learning (ML), deep learning (DL), artificial intelligence (AI), statistical process control (SPC), data mining and data science (DS) with sales data in practice. It’s meant as a truly exhaustive explanation starting from scratch with easy-to-adapt data. We’ll conquer the concepts of data science, explore our data, reflect on data visualization and storytelling, predict future sales, mine market baskets and recommend products to customers. In the end we’ll build a data product in the shape of a complete ready…


What SAP tables to join to analyze Throughput Time

SAP tables related to Production Planning PP module (image by author)

Motivation:

Even the most stable manufacturing processes will vary with regard to the quantity of items produced per unit of time. Wouldn’t it be interesting to analyze your throughput over time by extracting that data from your ERP into a data warehouse? The SQL below shows you how to get there, from an SAP BI point of view. Please note that there is no single 100% correct and complete solution for this request, since every SAP ERP system is customized to its company. But it should definitely point you in the right direction.

Solution:

If you are looking for throughput times, AFKO is the central SAP table…


Concretely applied to sales data step by step (part 1 of 12)

Can you still see the (random) forest for the (decision) trees? (image by author)

This data driven dealings development (DDDD) series is aimed at people who want to learn the concepts of statistical analysis, machine learning (ML), deep learning (DL), artificial intelligence (AI), statistical process control (SPC), data mining and data science (DS) with sales data in practice. It’s meant as a truly exhaustive “real world” explanation from scratch. We’ll conquer the concepts of data science, explore our data, reflect on data visualization and storytelling, predict future sales, mine our market baskets and recommend products to our customers. In the end we’ll build a data product in the shape of a complete ready to go…


Will Europe’s ambitious data- and cloud-infrastructure project keep its promises?

Overcoming Data Lake Silos with GAIA-X? (image by paraglider Kim Rehberg)

Currently, only a few hyperscale cloud providers dominate the market: Google Cloud Platform, Amazon Web Services, IBM Cloud, Microsoft Azure, and Alibaba Cloud, to name probably the most famous ones. What they all have in common is that none of them is of European origin. If you live in Europe and want to store and analyze data in a cloud, you will most likely end up with an American provider, which is subject to the CLOUD Act. And you will most likely find yourself more or less locked in to these giant global tech players. Europe’s digital infrastructure and analytics rely on non-European conglomerates that offer worldwide scalable…


Give wings to your Excel with Python Xlwings

Learn how to fly with Xlwings (Bavarian sky, image by author)

Motivation:

Oftentimes we conduct impressive Python analyses, but all our colleagues ask for is the final results in Excel. Many of them do not want to work with Jupyter Notebook, Python scripts and the like. They just want to stick to their beloved spreadsheet tool (Microsoft Excel, to call it by name). In “Excel-Python App for Non-Pythonists” we learned how to build a complete Excel-Python App for those colleagues. In this post we will inspect Xlwings in more depth. Xlwings gives us the power to develop interactive applications using Excel spreadsheets as the GUI (graphical…


Guide to easier import handling in Python

Reach the mountain’s top less exhausted using loops (image by author, antagonism makes the world go round)

Motivation:

Many times we have to import machine log files whose data has to be heavily prepared before it is digestible (a step also called data munging). This story provides you with a detailed explanation (including data and code) of readline, loops, split, pop, and regular expressions. Hopefully this helps you import and arrange your data more quickly in the future.
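To give a feeling for how these building blocks fit together, here is a minimal sketch. Please note that the log lines, the field layout (`timestamp;machine;temp=…;status=…`) and the function name `parse_log` are made up for illustration only; they are assumptions and not the robot log format used in the story.

```python
import io
import re

# Hypothetical sample log; the layout is an assumption for illustration only.
SAMPLE_LOG = (
    "2021-03-01 08:00:01;ROBOT_1;temp=41.5;status=OK\n"
    "# maintenance note, skipped by the parser\n"
    "2021-03-01 08:00:02;ROBOT_1;temp=42.0;status=OK\n"
    "\n"
    "2021-03-01 08:00:03;ROBOT_2;temp=55.3;status=WARN\n"
)

# Regular expression that captures the numeric part of "temp=41.5"
TEMP_PATTERN = re.compile(r"temp=(\d+(?:\.\d+)?)")


def parse_log(handle):
    """Read a log line by line and return a list of dict records."""
    records = []
    while True:
        line = handle.readline()
        if not line:  # readline returns an empty string at end of file
            break
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        fields = line.split(";")
        if len(fields) != 4:  # skip lines that do not match the assumed layout
            continue
        # pop removes and returns the last field, e.g. "status=OK"
        status = fields.pop().split("=")[-1]
        timestamp, machine, temp_field = fields
        match = TEMP_PATTERN.search(temp_field)
        if match is None:
            continue
        records.append(
            {
                "timestamp": timestamp,
                "machine": machine,
                "temp": float(match.group(1)),
                "status": status,
            }
        )
    return records


# io.StringIO stands in for a file opened with open("some_log.txt")
records = parse_log(io.StringIO(SAMPLE_LOG))
for record in records:
    print(record)
```

With a real file you would pass the handle from `open(...)` instead of `io.StringIO`; the loop itself stays the same.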

Background:

Our robot’s machine log file looks like this:


Sparsity, Similarity, and implicit binary Collaborative Filtering explained step by step with Python Code

Recommender uses similarity as a complement to distance (image by author)

Motivation:

Knowing which items our customers are likely to be highly interested in is vital in many businesses. Recapping our FPGrowth data mining, we saw that products are quite often related to each other. So if a customer buys product A, that customer will most likely also purchase product B, because both articles are related (for whatever reason, maybe like Beer and Diapers). Because of this association rule, we can expect customers to purchase more than just one product within one sales transaction. To add some extra flavor, would it not be awesome to be able to recommend items to our customers…
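As a rough sketch of the three ideas in the title (sparsity, similarity, and implicit binary collaborative filtering), the snippet below scores products a customer has not bought yet by their cosine similarity to products the customer already owns. The purchase data and the helper names (`item_vectors`, `sparsity`, `cosine`, `recommend`) are hypothetical and only serve as an illustration under these assumptions; this is not necessarily the approach taken in the story.

```python
from math import sqrt

# Hypothetical implicit feedback: customer -> set of purchased products.
# "Binary" means we only know whether a product was bought, not how often.
purchases = {
    "c1": {"beer", "diapers"},
    "c2": {"beer", "diapers", "chips"},
    "c3": {"chips"},
    "c4": {"beer", "chips"},
}


def item_vectors(purchases):
    """Invert the data: product -> set of customers who bought it."""
    items = {}
    for customer, basket in purchases.items():
        for item in basket:
            items.setdefault(item, set()).add(customer)
    return items


def sparsity(purchases):
    """Share of empty cells in the customer x product matrix."""
    items = item_vectors(purchases)
    filled = sum(len(basket) for basket in purchases.values())
    return 1 - filled / (len(purchases) * len(items))


def cosine(a, b):
    """Cosine similarity of two binary vectors stored as sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(customer, purchases, top_n=2):
    """Rank products the customer does not own by summed similarity."""
    items = item_vectors(purchases)
    owned = purchases[customer]
    scores = {}
    for candidate, buyers in items.items():
        if candidate in owned:
            continue
        scores[candidate] = sum(cosine(buyers, items[o]) for o in owned)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


print("sparsity:", round(sparsity(purchases), 3))
print("for c3:", recommend("c3", purchases))
```

Storing each product as a set of customers keeps the binary case simple: the dot product is just the size of the intersection, and the vector length is the square root of the set size.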

Jesko Rehberg

Writing Books about Data Analysis using statistical and machine learning models at DAR-Analytics.com.
