SparkR :: gapply: How to use linear regression across groups in a DataFrame?
€2-6 EUR / hour
Hi there,
I have a large dataset and want to fit a linear model to each group in it. Below is a small example that shows the principle I want to have parallelised.
# Determine six waiting times with the largest eruption time in minutes.
schema <- structType(structField("waiting", "double"), structField("max_eruption", "double"))
result <- gapply(
    df,
    "waiting",
    function(key, x) {
        # Return one row per group: the grouping key and the group's maximum eruption time.
        y <- data.frame(key, max(x$eruptions))
    },
    schema)
head(collect(arrange(result, "max_eruption", decreasing = TRUE)))
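The same gapply pattern could be extended to fit a linear model per group. What follows is only a minimal sketch under stated assumptions, not the poster's actual data or model: it uses R's built-in mtcars dataset, groups by cyl, and regresses mpg on wt. The helper name fit_group and the output column names are hypothetical. The worker is plain R, so it is first checked locally with split/lapply. The SparkR part runs only if SparkR is installed and SPARK_HOME is set.

```r
# Per-group worker (hypothetical name). `key` holds the grouping value(s) and
# `x` is an ordinary R data.frame with that group's rows. It returns one row
# whose columns must match the gapply schema further down.
fit_group <- function(key, x) {
  fit <- lm(mpg ~ wt, data = x)  # assumed model: mpg explained by weight
  data.frame(
    cyl       = key[[1]],
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    n         = as.double(nrow(x))
  )
}

# Local check without Spark: apply the same worker to each group of mtcars.
local_result <- do.call(
  rbind,
  lapply(split(mtcars, mtcars$cyl), function(x) fit_group(list(x$cyl[1]), x))
)
print(local_result)

# Distributed version: only attempted when SparkR and a Spark install are present.
if (requireNamespace("SparkR", quietly = TRUE) && nzchar(Sys.getenv("SPARK_HOME"))) {
  SparkR::sparkR.session()
  sdf <- SparkR::as.DataFrame(mtcars)
  # The schema describes the data.frame returned by fit_group, column by column.
  schema <- SparkR::structType(
    SparkR::structField("cyl", "double"),
    SparkR::structField("intercept", "double"),
    SparkR::structField("slope", "double"),
    SparkR::structField("n", "double")
  )
  result <- SparkR::gapply(sdf, "cyl", fit_group, schema)
  print(SparkR::collect(SparkR::arrange(result, "cyl")))
  SparkR::sparkR.session.stop()
}
```

Returning only the coefficients keeps the result small enough to collect. Each group's rows are handed to the worker as one in-memory data.frame, so this approach assumes that every individual group fits in a worker's memory.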
Project ID: #30580205
About the project
4 freelancers are bidding on this job, with an average bid of €10/hour
Hi, I am a professional statistician with 5 years of experience. I have read the job description and will help you complete the project. I have skills in data mining and the R programming language. I can deliver quality an...
Hi, I have extensive experience in R programming and hold a master's degree in data science. You can check my profile and reviews to see that I have done good work on R projects. Your project is a challenge for me. Le...
Hi, I graduated with a Bachelor of Statistics. I have experience using R, which I learned in college. I am also a specialist in basic statistical analysis (descriptive analysis, graph, chart...