RGF (Regularized Greedy Forest)


The RGF package is an R wrapper around the Regularized Greedy Forest (RGF) Python package, which also includes a multi-core implementation (FastRGF). More details on the functionality of the RGF package can be found in the package Documentation and Vignette.


References:

Rie Johnson and Tong Zhang, Learning Nonlinear Functions Using Regularized Greedy Forest

https://github.com/fukatani/rgf_python

https://github.com/baidu/fast_rgf


System Requirements



All Python modules must be installed in the default Python configuration (the configuration that the R session reports as default); otherwise errors will occur during the installation of the RGF package. Running reticulate::py_discover_config() shows which configuration is in use.
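As a complement to reticulate::py_discover_config(), the following minimal Python sketch can be run with the interpreter that R reports as default, to see which of the required modules are importable there. The helper name missing_modules is hypothetical (it is not part of rgf_python), sklearn is the import name of scikit-learn, and the script is written so that it should run under both Python 2.7 and Python 3.

```python
import importlib
import sys


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Dependencies listed in the installation notes (sklearn = scikit-learn).
    required = ["numpy", "scipy", "sklearn"]
    print("Interpreter: " + str(sys.executable))
    not_found = missing_modules(required)
    if not_found:
        print("Missing modules: " + ", ".join(not_found))
    else:
        print("All required modules are importable.")
```

If the interpreter path printed here differs from the one shown by reticulate::py_discover_config(), the modules were probably installed into a different Python than the one R uses.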


The installation notes for Linux, Macintosh and Windows are based on Python 2.7.


Debian/Ubuntu/Fedora


First install / upgrade the dependencies,


sudo pip install --upgrade pip setuptools

sudo pip install -U numpy

sudo pip install --upgrade scipy

sudo pip install -U scikit-learn


Then, download both rgf and fast_rgf by opening a console and giving,



git clone --recursive https://github.com/fukatani/rgf_python.git


With the --recursive option the fast_rgf folder is included in the downloaded data. Then install both rgf and fast_rgf using the following commands:


cd rgf_python

sudo python setup.py install 


FastRGF will be installed successfully only if the gcc version is >= 5.0.
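The gcc requirement can be checked before building. The sketch below is only an illustration, assuming that gcc is on the PATH and that `gcc -dumpversion` prints a dotted version number; the function names parse_version and gcc_at_least are hypothetical helpers, not part of rgf_python.

```python
import subprocess


def parse_version(text):
    """Turn a string such as '7.2.0' into a tuple of ints, e.g. (7, 2, 0)."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def gcc_at_least(version_text, minimum=(5, 0)):
    """Return True if `version_text` is a version >= `minimum`."""
    parsed = parse_version(version_text)
    if not parsed:
        return False
    # Pad short versions so that '5' is compared as (5, 0).
    padded = parsed + (0,) * max(0, len(minimum) - len(parsed))
    return padded >= minimum


if __name__ == "__main__":
    try:
        # `gcc -dumpversion` prints only the version number.
        output = subprocess.check_output(["gcc", "-dumpversion"])
        version = output.decode("utf-8").strip()
        verdict = "is" if gcc_at_least(version) else "is NOT"
        print("gcc " + version + " " + verdict + " recent enough for FastRGF")
    except (OSError, subprocess.CalledProcessError):
        print("gcc was not found on the PATH")
```

The comparison is done on integer tuples rather than on strings, so that a version such as 10.1 is not mistaken for something older than 5.0.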



Macintosh OSX


First do a fresh install of Python using brew (the brew-installed Python will normally appear as python2, because Macintosh OS already ships with a default python),


brew install python

brew link --overwrite python


Then update the .bash_profile file in home directory with the following paths


export PATH=/usr/local/bin:/usr/bin:$PATH

export PATH="/usr/local/opt/python/libexec/bin:$PATH"


Then install the dependencies for RGF and FastRGF


sudo pip2 install --upgrade pip setuptools

sudo pip2 install -U numpy

sudo pip2 install --upgrade scipy

sudo pip2 install -U scikit-learn


If an error appears for any of the previous installation commands, run one (or all) of the following, depending on which module wasn’t installed correctly,


sudo pip2 install --upgrade --ignore-installed --install-option '--install-data=/usr/local' numpy

sudo pip2 install --upgrade --ignore-installed --install-option '--install-data=/usr/local' scipy

sudo pip2 install --upgrade --ignore-installed --install-option '--install-data=/usr/local' scikit-learn


The FastRGF module requires gcc >= 5.0. To install gcc-7 with brew, run


brew install gcc

brew link --overwrite gcc

export CXX=g++-7 && export CC=gcc-7


then create a symbolic link


cd /usr/local/bin

ln -s gcc-7 gcc


Then continue with the installation of fast_rgf and rgf python,



git clone --recursive https://github.com/fukatani/rgf_python.git


cd rgf_python/include/fast_rgf/build

export CXX=/usr/local/bin/g++-7 && export CC=/usr/local/bin/gcc-7

cmake ..

make

sudo make install


cd

cd rgf_python/include/rgf/build

export CXX=/usr/local/bin/g++-7 && export CC=/usr/local/bin/gcc-7

cmake ..

make

sudo make install


Then install both rgf and fast-rgf using the following command,


cd 

cd rgf_python

sudo python2 setup.py install                 


After a successful rgf-python installation, open an R session and run the following reticulate command to point R at the brew-installed Python (otherwise the RGF package won’t work properly),



reticulate::use_python('/usr/local/bin/python2')


and then,



reticulate::py_discover_config()


to validate that the R session uses the Python version where RGF or FastRGF is installed. Then,



install.packages('RGF')


library(RGF)


to load the R package. The following warning may appear in the R session if FastRGF is not installed,



UserWarning: Cannot find FastRGF executable files. FastRGF estimators will be unavailable for usage.
  warnings.warn("Cannot find FastRGF executable files. FastRGF estimators will be unavailable for usage.")
  



Windows OS


First, download get-pip.py for Windows.


Update the environment variables (Control Panel >> System and Security >> System >> Advanced system settings >> Environment variables >> System variables >> Path >> Edit) by adding (for instance, in the case of Python 2.7),



C:\Python27;C:\Python27\Scripts
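To confirm that the Path was updated, the entries can be compared against the current value of the variable. The following is a minimal sketch, assuming a new command prompt was opened after the edit so that the change is visible; missing_path_entries and normalize are hypothetical helper names, and the comparison ignores letter case and trailing slashes, as Windows does.

```python
import os


def normalize(entry):
    """Lower-case a Windows path entry and drop quotes and trailing slashes."""
    return entry.strip().strip('"').rstrip("\\/").lower()


def missing_path_entries(required, path_value, sep=";"):
    """Return the entries of `required` that do not appear in `path_value`."""
    present = set(normalize(e) for e in path_value.split(sep) if e.strip())
    return [entry for entry in required if normalize(entry) not in present]


if __name__ == "__main__":
    # The two directories suggested above for Python 2.7.
    required = ["C:\\Python27", "C:\\Python27\\Scripts"]
    for entry in missing_path_entries(required, os.environ.get("PATH", "")):
        print("Not on Path: " + entry)
```

The script prints nothing when both directories are already present.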


Install the Visual C++ 2015 Build Tools


Open the Command prompt (console) and install the rgf_python dependencies,



pip install --upgrade pip setuptools

pip install -U numpy

pip install --upgrade scipy

pip install -U scikit-learn


Then download git for windows,



https://git-scm.com/download/win


and run the downloaded .exe file. Then do,



git clone --recursive https://github.com/fukatani/rgf_python.git


FastRGF requires a gcc version >= 5.0. To find out the gcc version, open a command prompt (console) and type,



gcc --version


Installation / Upgrade of MinGW


Perform the following steps to upgrade MinGW (so that the simple RGF functions work, but not FastRGF)


MinGW is normally installed in the C:\ directory. First delete the folder C:\MinGW (if it already exists), and then remove its entry, usually C:\MinGW\bin, from the Path environment variable (Control Panel >> System and Security >> System >> Advanced system settings >> Environment variables >> System variables >> Path >> Edit). Then download the most recent version of MinGW, specifically mingw-get-setup.exe, which is an automated GUI installer assistant. After the new version is installed successfully, add C:\MinGW\bin to the Path environment variable again (in the same location as before). Then open a new command prompt (console) and type,



gcc --version


to find out if the new version of MinGW is installed properly.


A word of caution: if Rtools is already installed, make sure that it does not point to an older version of gcc. Just inspect the Path field of the environment variables (accessible as explained previously).
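Because the first matching directory on the Path wins, an older gcc shipped with Rtools can shadow a newer MinGW one. The sketch below lists every Path directory that contains the compiler, in search order; it is only an illustration, and dirs_providing is a hypothetical helper name.

```python
import os


def dirs_providing(filename, path_value, sep=os.pathsep):
    """Return, in search order, the PATH directories that contain `filename`."""
    found = []
    for directory in path_value.split(sep):
        directory = directory.strip().strip('"')
        if directory and os.path.isfile(os.path.join(directory, filename)):
            found.append(directory)
    return found


if __name__ == "__main__":
    # On Windows the compiler executable is gcc.exe.
    name = "gcc.exe" if os.name == "nt" else "gcc"
    hits = dirs_providing(name, os.environ.get("PATH", ""))
    for position, directory in enumerate(hits):
        # Only the first hit is the one a plain `gcc` command would run.
        label = "used" if position == 0 else "shadowed"
        print(label + ": " + os.path.join(directory, name))
```

If the line marked "used" points into an Rtools folder, moving the MinGW entry ahead of it in the Path (or removing the Rtools entry) should make the newer compiler the default.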


Perform the following steps only if a FastRGF installation is desired and the gcc version is < 5.0


FastRGF works only with MinGW-w64, because only this version provides POSIX threads. It can be downloaded from MingW-W64-builds. After a successful download and installation, the user should also update the Path environment variable (Control Panel >> System and Security >> System >> Advanced system settings >> Environment variables >> System variables >> Path >> Edit) by adding the following path (assuming the software is installed in the C:\Program Files (x86) folder),



C:\Program Files (x86)\mingw-w64\i686-7.2.0-posix-dwarf-rt_v5-rev1\mingw32\bin


Installation of cmake


First download cmake for Windows, win64-x64 Installer. Once the file is downloaded run the .exe file and during installation make sure to add CMake to the system PATH for all users.


Before the installation of rgf it might be necessary to remove the Rtools environment variables, such as C:\Rtools\bin (accessible as explained previously); otherwise errors might occur.


Installation of rgf


Open a console with administrator privileges (right click on cmd and run as administrator), then do



cd rgf_python/include/rgf/build

cmake ../ -G "MinGW Makefiles"

mingw32-make

mingw32-make install

cd C:\


Installation of fast_rgf



cd rgf_python/include/fast_rgf/build

cmake .. -G "MinGW Makefiles"

mingw32-make

mingw32-make install

cd C:\


Installation (of both) in python



cd rgf_python

python setup.py install


Then open a command prompt (console) and type,



python 


to launch Python and then type



import rgf


to check whether rgf is installed properly. Then continue with the installation of the RGF package,



install.packages('RGF')


On Windows, the RGF package can currently be used only from within the command prompt (console). First, find the full path of the R installation (right-click the R shortcut, probably on the Desktop, and navigate to Properties >> Shortcut >> Target). On my system, for instance, R is located in C:\Program Files\R\R-3.4.0\bin\x64\R. Then, by opening a command prompt (console) and typing (in my case, for instance),



cd C:\Program Files\R\R-3.4.0\bin\x64\

R

library(RGF)


one can proceed with the usage of the RGF package.


Installation of the RGF package


To install the package from CRAN use,



install.packages('RGF')


and to download the latest version from GitHub use the install_github function of the devtools package,


devtools::install_github(repo = 'mlampros/RGF')


Use the following link to report bugs/issues,

https://github.com/mlampros/RGF/issues