Profile your application using the Performance Estimator
==========================================================

.. image:: img/performance_estimator_guide_details.png
   :alt: Performance Estimator Guide

The Huxelerate Performance Estimator never requires you to upload your source code online, so your code stays private.
To generate a profiling report, you only need to slightly modify the way your source code is compiled. Five steps are required:
    
    #. Compile each C/C++ source file to LLVM Intermediate Representation (IR)
    #. Link all LLVM IR files to a single one
    #. Instrument your LLVM IR linked file
    #. Compile the instrumented LLVM IR file to obtain your executable
    #. Execute your application with a representative dataset to obtain an accurate estimate

Compile sources to LLVM Intermediate Representation (IR)
----------------------------------------------------------

We assume you have a Makefile to compile your application. If not, you can perform the same steps by running the commands manually.
First, define a convenient variable inside the Makefile to run the Huxelerate Performance Estimator Docker container:

.. code-block:: make

    HPE=docker run --rm -v $(shell pwd):/data huxelerate/performance_estimator

The ``/data`` directory is the default working directory inside the Huxelerate Performance Estimator Docker container.
When the container is run this way, the current working directory
(the one from which ``make`` is run) is mapped to the ``/data`` directory inside the container.

With this command we assume that the Makefile is located in the main directory of your application containing 
all the source code and relevant files for compilation.
If this is not the case, replace ``$(shell pwd)`` with the specific path in your file system.
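
For example, a minimal sketch assuming the sources live in the hypothetical directory ``/home/user/my_project`` (a placeholder, not a path required by the tool) would define the variable as follows:

.. code-block:: make

    # /home/user/my_project is a hypothetical path: replace it with the
    # directory that contains your sources on the host machine.
    HPE=docker run --rm -v /home/user/my_project:/data huxelerate/performance_estimator

Relative paths passed to the container are then resolved against that directory, since it is mounted as ``/data``.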

Then you need to change your default compiler to the clang compiler inside the Docker Container:

.. code-block:: make

    CC= $(HPE) clang

Finally, to compile into LLVM IR, add the ``-emit-llvm`` flag during object generation. The resulting ``.o`` files then contain LLVM IR rather than native object code:

.. code-block:: make
    
    %.o : %.c
        $(CC) $(OTHER_FLAGS) -c -emit-llvm $< -o $@

.. important::
    The Performance Estimator currently does not support profiling of C++ code that makes use of exceptions.
    To profile C++ code, make sure to compile it using the ``-fno-exceptions`` compiler flag.
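
As an illustration, a pattern rule for C++ sources could look like the following sketch. It assumes that ``clang++`` is available inside the container alongside ``clang`` and that your C++ files use the ``.cpp`` extension; the ``CXX`` variable is introduced here only for the example, while ``$(HPE)`` and ``$(OTHER_FLAGS)`` are the variables used in the rules above.

.. code-block:: make

    # Assumption: clang++ is shipped in the container next to clang.
    CXX= $(HPE) clang++

    # Compile C++ sources to LLVM IR with exceptions disabled.
    %.o : %.cpp
        $(CXX) $(OTHER_FLAGS) -fno-exceptions -c -emit-llvm $< -o $@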

.. note::
    When running commands in the context of the Docker container, absolute paths will be different from those on your local machine!
    Whenever possible, use relative paths when compiling your code. If you really need absolute paths,
    make sure to change them so that they refer to the same files inside the Docker container.
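
For instance, with the volume mapping used above, a file in the mounted directory on the host appears under ``/data`` in the container. The sketch below assumes a hypothetical ``include`` directory inside a hypothetical project located at ``/home/user/my_project`` and mounted as ``/data``:

.. code-block:: make

    # Preferred: a relative path is resolved against /data, the working
    # directory inside the container.
    OTHER_FLAGS= -Iinclude

    # If an absolute path is unavoidable, use the container-side path:
    # /home/user/my_project/include on the host is /data/include in the container.
    # OTHER_FLAGS= -I/data/include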

Link all IR files to a single one
-----------------------------------
Link all LLVM IR files into a single LLVM IR file (``linked.ll``) using the ``llvm-link`` command:

.. code-block:: make
    
    linked.ll : $(OBJS)
        $(HPE) llvm-link $^ -o linked.ll

Instrument your linked file
----------------------------

Use the ``hux-roofline-instrument`` command to instrument the ``linked.ll`` file.
In this step, you also need to specify the function to profile using the ``--function`` flag:

.. code-block:: make

    instrumented.ll : linked.ll
        $(HPE) hux-roofline-instrument -i linked.ll -o instrumented.ll --function my_function_to_profile

Compile to obtain your executable
----------------------------------

The last compilation step generates the executable from the ``instrumented.ll`` file.
The command is no different from the one used to compile any other object file.
The only requirement is that the ``dl`` and ``stdc++`` libraries get linked, since the instrumentation needs them:

.. code-block:: make

    executable_name : instrumented.ll
        $(CC) $(OTHER_FLAGS) -ldl -lstdc++ instrumented.ll -o $@

Finally, if you are using a Makefile, running ``make`` with your usual target is enough to build the instrumented executable.
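
Putting the steps together, a complete Makefile could look like the following sketch. The source files ``main.c`` and ``kernel.c`` are hypothetical, and ``my_function_to_profile`` and ``executable_name`` are the placeholders used in the rules above; adapt them to your project. Remember that recipe lines in an actual Makefile must start with a tab character.

.. code-block:: make

    # Run every tool inside the Performance Estimator Docker container.
    HPE=docker run --rm -v $(shell pwd):/data huxelerate/performance_estimator
    CC= $(HPE) clang

    # Hypothetical project layout: adapt the flags and the file list.
    OTHER_FLAGS=
    OBJS= main.o kernel.o

    all : executable_name

    # Step 1: compile each source file to LLVM IR.
    %.o : %.c
        $(CC) $(OTHER_FLAGS) -c -emit-llvm $< -o $@

    # Step 2: link all LLVM IR files into a single one.
    linked.ll : $(OBJS)
        $(HPE) llvm-link $^ -o linked.ll

    # Step 3: instrument the function to profile.
    instrumented.ll : linked.ll
        $(HPE) hux-roofline-instrument -i linked.ll -o instrumented.ll --function my_function_to_profile

    # Step 4: build the executable from the instrumented IR.
    executable_name : instrumented.ll
        $(CC) $(OTHER_FLAGS) -ldl -lstdc++ instrumented.ll -o $@

    clean :
        rm -f $(OBJS) linked.ll instrumented.ll executable_name

    .PHONY : all clean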

Execute your application with a representative dataset to allow a more accurate estimate
------------------------------------------------------------------------------------------------

For performance estimation purposes, your application must be executed inside the Huxelerate Performance Estimator Docker container:

.. code-block:: bash

    docker run --rm -v $(pwd):/data huxelerate/performance_estimator ./executable_name [executable_options]

The output of the execution is a profiling report file with the ``.hprof`` extension, which you need to upload `online <https://performance.huxelerate.it/performance-estimation>`_ to obtain the performance estimation.