Tuesday, March 21, 2023

Extract data from SAP ERP using AWS Glue and the SAP SDK

This is a guest post by Siva Manickam and Prahalathan M from Vyaire Medical Inc.

Vyaire Medical Inc. is a global company, headquartered in suburban Chicago, focused exclusively on supporting breathing through every stage of life. Established from legacy brands with a 65-year history of pioneering breathing technology, the company’s portfolio of integrated solutions is designed to enable, enhance, and extend lives.

At Vyaire, our team of 4,000 pledges to advance innovation and evolve what’s possible to ensure every breath is taken to its fullest. Vyaire’s products are available in more than 100 countries and are recognized, trusted, and preferred by specialists throughout the respiratory community worldwide. Vyaire has a 65-year history of clinical experience and leadership with over 27,000 unique products and 370,000 customers worldwide.

Vyaire Medical’s applications landscape has multiple ERPs, such as SAP ECC, JD Edwards, Microsoft Dynamics AX, SAP Business One, Pointman, and Made2Manage. Vyaire uses Salesforce as our CRM platform and the ServiceMax CRM add-on for managing field service capabilities. Vyaire developed a custom data integration platform, iDataHub, powered by AWS services such as AWS Glue, AWS Lambda, and Amazon API Gateway.

In this post, we share how we extracted data from SAP ERP using AWS Glue and the SAP SDK.

Business and technical challenges

Vyaire is working on deploying the field service management solution ServiceMax (SMAX, natively built on the SFDC ecosystem), offering features and services that help Vyaire’s Field Services team improve asset uptime with optimized in-person and remote service, boost technician productivity with the latest mobile tools, and deliver metrics for confident decision-making.

A major challenge with the ServiceMax implementation is building a data pipeline between ERP and the ServiceMax application, precisely integrating pricing, orders, and primary data (product, customer) from SAP ERP to ServiceMax using Vyaire’s custom-built integration platform iDataHub.

Solution overview

Vyaire’s iDataHub, powered by AWS Glue, has been effectively used for data movement between SAP ERP and ServiceMax.

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development. It’s used in Vyaire’s Enterprise iDataHub Platform for facilitating data movement across different systems; however, the focus of this post is the integration between SAP ERP and Salesforce SMAX.

The following diagram illustrates the integration architecture between Vyaire’s Salesforce ServiceMax and SAP ERP system.

In the following sections, we walk through setting up a connection to SAP ERP using AWS Glue and the SAP SDK through remote function calls. The high-level steps are as follows:

  1. Clone the PyRFC module from GitHub.
  2. Set up the SAP SDK on an Amazon Elastic Compute Cloud (Amazon EC2) machine.
  3. Create the PyRFC wheel file.
  4. Merge SAP SDK files into the PyRFC wheel file.
  5. Test the connection with SAP using the wheel file.


For this walkthrough, you should have the following:

Clone the PyRFC module from GitHub

For instructions for creating and connecting to an Amazon Linux 2 AMI EC2 instance, refer to Tutorial: Get started with Amazon EC2 Linux instances.

We chose Amazon Linux EC2 so that the SDK and PyRFC are compiled in a Linux environment that is compatible with AWS Glue.

At the time of writing this post, AWS Glue’s latest supported Python version is 3.7. Ensure that the Amazon EC2 Linux Python version and the AWS Glue Python version are the same. In the following instructions, we install Python 3.7 in Amazon EC2; you can follow the same instructions to install future versions of Python.

  1. In the bash terminal of the EC2 instance, run the following command (Amazon Linux 2 uses yum as its package manager, and its python3 package provides Python 3.7):
sudo yum install -y python3

  2. Log in to the Linux terminal, install git, and clone the PyRFC module into the aws_to_sap folder using the following commands:
ssh -i "aws-glue-ec2.pem" [email protected] 
mkdir aws_to_sap 
cd aws_to_sap 
sudo yum install -y git 
git clone https://github.com/SAP/PyRFC.git
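
Because the wheel is built against one specific interpreter, it can help to confirm the Python version on the EC2 instance before compiling anything. The following is a minimal sketch of such a check; the target version (3, 7) comes from this post, while the helper name matches_glue_python is hypothetical and only used for illustration.

```python
import sys

# Assumption: the target AWS Glue job runs Python 3.7, as stated in this post.
GLUE_PYTHON = (3, 7)


def matches_glue_python(version_info=sys.version_info, target=GLUE_PYTHON):
    """Return True when the major.minor version equals the AWS Glue target."""
    return tuple(version_info[:2]) == tuple(target)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if matches_glue_python():
        print(f"Python {current} matches the AWS Glue target")
    else:
        print(
            f"Python {current} differs from the AWS Glue target "
            f"{GLUE_PYTHON[0]}.{GLUE_PYTHON[1]}; a wheel built here may not load in AWS Glue"
        )
```

Run it with the same python3 binary that will later build the wheel, so the check reflects the interpreter that actually compiles PyRFC.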

Set up the SAP SDK on an Amazon EC2 machine

To set up the SAP SDK, complete the following steps:

  1. Download the nwrfcsdk.zip file from a licensed SAP source to your local machine.
  2. In a new terminal on your local machine, run the following command to copy the nwrfcsdk.zip file to the aws_to_sap folder on the EC2 instance:
scp -i "aws-glue-ec2.pem" -r "c:\nwrfcsdk\nwrfcsdk.zip" [email protected]:/home/ec2-user/aws_to_sap/

  3. On the EC2 instance, unzip the nwrfcsdk.zip file in the aws_to_sap working directory and verify the contents:
cd ~/aws_to_sap
unzip nwrfcsdk.zip

  4. Configure the SAP SDK environment variable SAPNWRFC_HOME and verify the contents:
export SAPNWRFC_HOME=/home/ec2-user/aws_to_sap/nwrfcsdk
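
The "verify the contents" part of the last step can be scripted. The sketch below assumes only what this post shows: that SAPNWRFC_HOME points at the unzipped SDK and that the SDK keeps its shared libraries in a lib subfolder. The function name find_sdk_libraries is hypothetical.

```python
import os
from pathlib import Path


def find_sdk_libraries(sdk_home):
    """Return the sorted names of shared libraries found under <sdk_home>/lib."""
    lib_dir = Path(sdk_home) / "lib"
    if not lib_dir.is_dir():
        return []
    # Match plain and versioned shared objects (names containing ".so").
    return sorted(p.name for p in lib_dir.iterdir() if p.is_file() and ".so" in p.name)


if __name__ == "__main__":
    sdk_home = os.environ.get("SAPNWRFC_HOME")
    if not sdk_home:
        print("SAPNWRFC_HOME is not set in this shell")
    else:
        libraries = find_sdk_libraries(sdk_home)
        if libraries:
            print(f"Found {len(libraries)} shared libraries in {sdk_home}/lib:")
            for name in libraries:
                print(f"  {name}")
        else:
            print(f"No shared libraries found under {sdk_home}/lib")
```

An empty result usually means the variable points at the wrong folder or the archive was unzipped somewhere else.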

Create the PyRFC wheel file

Complete the following steps to create your wheel file:

  1. On the EC2 instance, install the Python modules cython and wheel for generating wheel files using the following command:
pip3 install cython wheel

  2. Navigate to the PyRFC directory you created and run the following command to generate the wheel file:
python3 setup.py bdist_wheel

Verify that the pyrfc-2.5.0-cp37-cp37m-linux_x86_64.whl wheel file is created as in the following screenshot in the PyRFC/dist folder. Note that you may see a different wheel file name based on the latest PyRFC version on GitHub.
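
The wheel file name itself encodes the Python version and platform it was built for, so it is a quick way to confirm the build matches AWS Glue. The following sketch splits a name according to the standard wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag); the function name parse_wheel_name is hypothetical.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its standard components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    info = parse_wheel_name("pyrfc-2.5.0-cp37-cp37m-linux_x86_64.whl")
    print(info["python"], info["platform"])
```

For the file name used in this post, the Python tag is cp37 (CPython 3.7) and the platform tag is linux_x86_64; a different Python tag would suggest the wheel was built with another interpreter.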

Merge SAP SDK files into the PyRFC wheel file

To merge the SAP SDK files, complete the following steps:

  1. Unzip the wheel file you created:
cd dist
unzip pyrfc-2.5.0-cp37-cp37m-linux_x86_64.whl

  2. While still in the dist folder, copy the contents of lib (the SAP SDK files) to the extracted pyrfc folder:
cp ~/aws_to_sap/nwrfcsdk/lib/* pyrfc

Now you can update the rpath of the SAP SDK binaries using the PatchELF utility, a simple utility for modifying existing ELF executables and libraries. Setting the rpath to $ORIGIN tells the dynamic loader to look for dependent libraries in the same folder as the library being loaded, which is where the SDK files now live inside the package.

  3. Install the supporting dependencies (gcc, gcc-c++, python3-devel) for the Linux utility PatchELF:
sudo yum install -y gcc gcc-c++ python3-devel

  4. Download and install PatchELF:
wget https://download-ib01.fedoraproject.org/pub/epel/7/x86_64/Packages/p/patchelf-0.12-1.el7.x86_64.rpm
sudo rpm -i patchelf-0.12-1.el7.x86_64.rpm

  5. Run patchelf:
find . -name '*.so' -exec patchelf --set-rpath '$ORIGIN' {} \;

  6. Update the wheel file with the modified pyrfc and dist-info folders:
zip -r pyrfc-2.5.0-cp37-cp37m-linux_x86_64.whl pyrfc pyrfc-2.5.0.dist-info

  7. Copy the wheel file pyrfc-2.5.0-cp37-cp37m-linux_x86_64.whl from Amazon EC2 to Amazon Simple Storage Service (Amazon S3):
aws s3 cp /home/ec2-user/aws_to_sap/PyRFC/dist/ s3://<bucket_name>/ec2-dump --recursive
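
Before relying on the uploaded wheel, it may be worth confirming that the SAP SDK libraries really ended up inside it. A wheel is a zip archive, so the standard zipfile module can list its contents. The sketch below is illustrative only; the function name bundled_libraries is hypothetical, and the package folder name pyrfc is taken from the commands above.

```python
import zipfile


def bundled_libraries(wheel_path, package="pyrfc"):
    """Return the shared-library entries stored under <package>/ in a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            entry
            for entry in wheel.namelist()
            if entry.startswith(package + "/") and ".so" in entry.rsplit("/", 1)[-1]
        )


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        print("usage: python3 check_wheel.py <path-to-wheel>")
    else:
        for entry in bundled_libraries(sys.argv[1]):
            print(entry)
```

The list also includes the compiled PyRFC extension module itself, so expect at least one entry even for an unmodified wheel; the SAP SDK libraries should appear alongside it after the merge.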

Test the connection with SAP using the wheel file

The following is sample code to test the connectivity between the SAP system and AWS Glue using the wheel file.

  1. On the AWS Glue Studio console, choose Jobs in the navigation pane.
  2. Select Spark script editor and choose Create.

  3. Overwrite the boilerplate code with the following code on the Script tab:
import os, pyrfc
# Point the dynamic loader at the SAP SDK libraries bundled inside the pyrfc package
os.environ['LD_LIBRARY_PATH'] = os.path.dirname(pyrfc.__file__)
# Restart Python so the new LD_LIBRARY_PATH applies to the script below
os.execv('/usr/bin/python3', ['/usr/bin/python3', '-c', """
from pyrfc import Connection
import pandas as pd

## Variable declarations
sap_table = ''  # SAP Table Name
fields = []     # List of fields required to be pulled, as [{'FIELDNAME': '...'}]
options = []    # the WHERE clause of the query is called "options", as [{'TEXT': '...'}]
max_rows = 0    # MaxRows (integer)
from_row = 0    # Row of data origination (integer)

try:
    # Establish SAP RFC connection
    conn = Connection(ashost='', sysnr='', client='', user='', passwd='')
    print(f"SAP Connection successful – connection object: {conn}")
    if conn:
        # Read SAP Table information
        tables = conn.call("RFC_READ_TABLE", QUERY_TABLE=sap_table, DELIMITER='|', FIELDS=fields, OPTIONS=options, ROWCOUNT=max_rows, ROWSKIPS=from_row)
        # Access specific row & column information from the SAP Data
        data = [row["WA"].split("|") for row in tables["DATA"]]  # pull the data part of the result set
        columns = [field["FIELDNAME"] for field in tables["FIELDS"]]  # pull the field name part of the result set
        df = pd.DataFrame(data, columns=columns)
        if not df.empty:
            print(f"Successfully extracted data from SAP using custom RFC - Printing the top 5 rows: {df.head(5)}")
        else:
            print("No data returned from the request. Please check database/schema details")
    else:
        print("Unable to connect with SAP. Please check connection details")
except Exception as e:
    print(f"An exception occurred while connecting with SAP system: {e.args}")
"""])

  4. On the Job details tab, fill in the mandatory fields.
  5. In the Advanced properties section, provide the S3 URI of the wheel file in the Job parameters section as a key-value pair:
    1. Key: --additional-python-modules
    2. Value: s3://<bucket_name>/ec2-dump/pyrfc-2.5.0-cp37-cp37m-linux_x86_64.whl (provide your S3 bucket name)

  6. Save the job and choose Run.
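
The ROWCOUNT and ROWSKIPS parameters in the script control how many rows are returned and where reading starts, which also makes it possible to read a large table in pages rather than in one call. The following is a minimal sketch of that idea, not the approach used in iDataHub. It assumes the response carries its rows in a DATA list, as in the script, and it takes the call function as an argument so it can be tried without an SAP system; the names read_in_pages and fake_call are hypothetical.

```python
def read_in_pages(call, table, page_size, **extra_params):
    """Yield DATA rows from RFC_READ_TABLE, fetching page_size rows per call.

    `call` is expected to behave like pyrfc's Connection.call: it receives the
    function module name plus keyword parameters and returns a dict.
    """
    if page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    skipped = 0
    while True:
        result = call(
            "RFC_READ_TABLE",
            QUERY_TABLE=table,
            DELIMITER="|",
            ROWCOUNT=page_size,
            ROWSKIPS=skipped,
            **extra_params,
        )
        rows = result["DATA"]
        yield from rows
        # A short (or empty) page means the table has been read to the end.
        if len(rows) < page_size:
            break
        skipped += len(rows)


if __name__ == "__main__":
    # Stand-in for a real connection, using made-up rows for illustration.
    sample_rows = [{"WA": f"row-{i}|value-{i}"} for i in range(5)]

    def fake_call(function_name, **params):
        start = params["ROWSKIPS"]
        return {"DATA": sample_rows[start:start + params["ROWCOUNT"]]}

    for row in read_in_pages(fake_call, "DEMO_TABLE", page_size=2):
        print(row["WA"])
```

With a real connection, conn.call would be passed in place of fake_call, together with the FIELDS and OPTIONS values as extra keyword parameters. Paging this way assumes the table does not change between calls.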

Verify SAP connectivity

Complete the following steps to verify SAP connectivity:

  1. When the job run is complete, navigate to the Runs tab on the Jobs page and choose Output logs in the logs section.
  2. Choose the job_id and open the detailed logs.
  3. Observe the message SAP Connection successful – connection object: <connection object>, which confirms a successful connection with the SAP system.
  4. Observe the message Successfully extracted data from SAP using custom RFC - Printing the top 5 rows, which confirms successful access of data from the SAP system.


AWS Glue facilitated the data extraction, transformation, and loading process from different ERPs into Salesforce SMAX to improve the visibility of Vyaire’s products and related information to service technicians and tech support users.

In this post, you learned how you can use AWS Glue to connect to SAP ERP using SAP SDK remote functions. To learn more about AWS Glue, check out AWS Glue Documentation.

About the Authors

Siva Manickam is the Director of Enterprise Architecture, Integrations, Digital Research & Development at Vyaire Medical Inc. In this role, Mr. Manickam is responsible for the company’s corporate functions (Enterprise Architecture, Enterprise Integrations, Data Engineering) and product function (Digital Innovation Research and Development).

Prahalathan M is the Data Integration Architect at Vyaire Medical Inc. In this role, he is responsible for end-to-end enterprise solutions design, architecture, and modernization of integrations and data platforms using AWS cloud-native services.

Deenbandhu Prasad is a Senior Analytics Specialist at AWS, specializing in big data services. He is passionate about helping customers build modern data architecture on the AWS Cloud. He has helped customers of all sizes implement data management, data warehouse, and data lake solutions.


