Getting Started with Python using R and reticulate
Want to use Python’s libraries without leaving R? The reticulate package lets you combine R’s data handling and visualization with Python’s machine learning and scientific computing tools in a single session. This post shows how to set it up and use it.
Quick Setup in 4 Steps
1. Install reticulate
install.packages("reticulate")
library(reticulate)
2. Install Python via Miniconda
The easiest approach is to let reticulate handle Python installation for you:
install_miniconda(path = "c:/miniconda")
3. Connect to Python
Reticulate creates a default environment called r-reticulate. Let’s connect to it:
# Check available environments
conda_list()
# Connect to the default environment
use_condaenv("r-reticulate")
4. Install Python Packages
Now you can install any Python packages you need:
py_install(c("pandas", "scikit-learn", "matplotlib"))
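To confirm that the installed packages are visible to the Python interpreter, a quick check can be run from the Python side. The following is a minimal sketch using only the standard library; the helper names `is_importable` and `report` are illustrative, and note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util

# Import names, not install names: scikit-learn is imported as "sklearn".
MODULES = ["pandas", "sklearn", "matplotlib"]

def is_importable(module_name):
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None

def report(modules):
    """Map each module name to whether it can be imported."""
    return {name: is_importable(name) for name in modules}

for name, found in report(MODULES).items():
    print(f"{name}: {'available' if found else 'missing'}")
```

From R, reticulate's py_module_available("sklearn") performs a similar check without leaving the R session.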
Three Ways to Use Python in R
1. Import Python Modules Directly
# Import pandas and use it like any R package
pd <- import("pandas")
# Create a pandas Series
pd$Series(c(1, 2, 3, 4, 5))
# Import numpy for numerical operations
np <- import("numpy")
np$mean(c(1:100)) # Calculate mean using numpy
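It helps to know what Python actually receives in the calls above. By default, reticulate converts a multi-element R vector to a Python list, so `c(1:100)` arrives as a list of the integers 1 to 100. Below is a rough Python-side equivalent of the numpy call, as a sketch: it uses numpy when it is installed and otherwise falls back to the standard library's `statistics.fmean` as a stand-in.

```python
# What np$mean(c(1:100)) looks like from Python's point of view.
try:
    import numpy as np
    mean = np.mean
except ImportError:
    # Stand-in so the sketch still runs without numpy installed
    from statistics import fmean as mean

values = list(range(1, 101))   # the R vector c(1:100) after conversion
result = float(mean(values))
print(result)  # 50.5
```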
2. Write Python Code in R Markdown
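You can mix R and Python code in the same R Markdown document by adding Python code chunks, which start with {python} instead of {r}. Objects created in a Python chunk are available from R chunks through the py object (for example, py$df):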
# This is Python code!
import pandas as pd
import numpy as np
# Create a simple DataFrame
df = pd.DataFrame({
    'A': np.random.randn(5),
    'B': np.random.randn(5)
})
print(df.describe())
3. Use Python Libraries in R Workflows
The most powerful approach is using Python’s machine learning libraries within R:
# Import scikit-learn
sk <- import("sklearn.linear_model")
# Create and fit a linear regression model
model <- sk$LinearRegression()
model$fit(X = as.matrix(mtcars[, c("disp", "hp", "wt")]),
          y = mtcars$mpg)
# Get predictions and coefficients
predictions <- model$predict(as.matrix(mtcars[, c("disp", "hp", "wt")]))
coefficients <- data.frame(
  Feature = c("Intercept", "disp", "hp", "wt"),
  Coefficient = c(model$intercept_, model$coef_)
)
coefficients
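For readers new to scikit-learn: `intercept_` holds the fitted constant term and `coef_` holds one slope per feature column. As a rough illustration of what ordinary least squares computes, here is a single-feature version in plain Python. The function name `fit_simple_ols` and the toy data are made up for this sketch; scikit-learn's own implementation handles multiple features and is not reproduced here.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one feature."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the ratio of the x-y co-variation to the variation in x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical toy data lying exactly on the line y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

intercept, slope = fit_simple_ols(x, y)
print(intercept, slope)
```

With three features, as in the mtcars example, the same idea yields one intercept and three slopes, which is why the data frame of coefficients has four rows.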
Real-World Applications
Here is one way to combine R and Python in a data science workflow:
Data Science Pipeline
# 1. Data cleaning with R's tidyverse
library(readr)
library(dplyr)    # provides %>%, filter(), and mutate()
library(ggplot2)
clean_data <- read_csv("data.csv") %>%
  filter(!is.na(important_column)) %>%
  mutate(new_feature = feature1 / feature2)
# Placeholder: names of the columns to use as model inputs
features <- c("feature1", "feature2")
# 2. Machine learning with Python's scikit-learn
sk <- import("sklearn.ensemble")
# 100L passes an integer; a plain 100 would reach Python as a float
model <- sk$RandomForestClassifier(n_estimators = 100L)
model$fit(X = as.matrix(clean_data[, features]),
          y = clean_data$target)
# 3. Visualization with R's ggplot2
# Column 2 holds the predicted probability of the second class
predictions <- model$predict_proba(as.matrix(clean_data[, features]))[, 2]
clean_data %>%
  mutate(prediction = predictions) %>%
  ggplot(aes(x = feature1, y = feature2, color = prediction)) +
  geom_point() +
  scale_color_viridis_c()
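One detail in step 3 is worth spelling out: `predict_proba` returns one column per class, and R's `[, 2]` selects the second column, which Python would address as index 1 because Python counts from zero. A minimal sketch with made-up probabilities (the `proba` values are illustrative, not model output):

```python
# Each row is one observation; columns are class probabilities
# in the order given by the classifier's classes_ attribute.
proba = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]

# R: proba[, 2]  ->  Python: column index 1
second_class = [row[1] for row in proba]
print(second_class)  # [0.1, 0.8, 0.5]
```

Keeping this off-by-one difference in mind helps when translating indexing between R code and Python documentation.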
When to Use Each Language
Use R for:
- Data manipulation with dplyr/data.table
- Statistical modeling and hypothesis testing
- Publication-quality visualization
- Interactive reports and dashboards
Use Python for:
- Deep learning with TensorFlow/PyTorch
- Natural language processing
- Computer vision
- Advanced machine learning algorithms
With reticulate, you don’t have to choose: use the best tool for each part of your analysis!