Shift values in columns up if there are missing values above












8















I have a data frame like this:



df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A", 
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))


which looks like this:



  id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600


I would like to shift values in columns up if there are missing values above (by group). The result should look like this:



  id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600


I have no idea how to do this. Any thoughts?










share|improve this question









New contributor




Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





















  • Should "NA" be NA (i.e not a string?)

    – Khaynes
    yesterday








  • 1





    Edit your question. NA is currently a string.

    – NelsonGon
    yesterday






  • 1





    Are your numbers really of class character?

    – LAP
    yesterday
















8















I have a data frame like this:



df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A", 
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))


which looks like this:



  id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600


I would like to shift values in columns up if there are missing values above (by group). The result should look like this:



  id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600


I have no idea how to do this. Any thoughts?










share|improve this question









New contributor




Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





















  • Should "NA" be NA (i.e not a string?)

    – Khaynes
    yesterday








  • 1





    Edit your question. NA is currently a string.

    – NelsonGon
    yesterday






  • 1





    Are your numbers really of class character?

    – LAP
    yesterday














8












8








8








I have a data frame like this:



df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A", 
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))


which looks like this:



  id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600


I would like to shift values in columns up if there are missing values above (by group). The result should look like this:



  id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600


I have no idea how to do this. Any thoughts?










share|improve this question









New contributor




Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.












I have a data frame like this:



df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A", 
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))


which looks like this:



  id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600


I would like to shift values in columns up if there are missing values above (by group). The result should look like this:



  id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600


I have no idea how to do this. Any thoughts?







r dataframe






share|improve this question









New contributor




Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.











share|improve this question









New contributor




Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









share|improve this question




share|improve this question








edited yesterday









Ronak Shah

34.1k103854




34.1k103854






New contributor




Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.









asked yesterday









RuilinRuilin

442




442




New contributor




Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.





New contributor





Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.






Ruilin is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.













  • Should "NA" be NA (i.e not a string?)

    – Khaynes
    yesterday








  • 1





    Edit your question. NA is currently a string.

    – NelsonGon
    yesterday






  • 1





    Are your numbers really of class character?

    – LAP
    yesterday



















  • Should "NA" be NA (i.e not a string?)

    – Khaynes
    yesterday








  • 1





    Edit your question. NA is currently a string.

    – NelsonGon
    yesterday






  • 1





    Are your numbers really of class character?

    – LAP
    yesterday

















Should "NA" be NA (i.e not a string?)

– Khaynes
yesterday







Should "NA" be NA (i.e not a string?)

– Khaynes
yesterday






1




1





Edit your question. NA is currently a string.

– NelsonGon
yesterday





Edit your question. NA is currently a string.

– NelsonGon
yesterday




1




1





Are your numbers really of class character?

– LAP
yesterday





Are your numbers really of class character?

– LAP
yesterday












3 Answers
3






active

oldest

votes


















5














Here is a rough concept using data.table that can be refined:



library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}

setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]

id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600





share|improve this answer
























  • Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!

    – Ruilin
    yesterday






  • 1





    In the last step, specify the .SDcols and you can use !Reduce('&', .SD) to create the logical vector

    – akrun
    yesterday











  • Also if it is all the columns except the id you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)].

    – snoram
    yesterday



















4














Don't think this is the most efficient way to do it but one option



library(rowr)

df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1


# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600


We split the dataframe into list of dataframe for every id and for each dataframe we remove the NA values using na.omit and use cbind.fill to fill the values with NA and finally merge the list of dataframes back into one using rbind with do.call.






share|improve this answer


























  • Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!

    – Ruilin
    yesterday











  • @Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.

    – Ronak Shah
    yesterday





















3














Here is an option with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'id', order the other column based on the NA values, then create an index to remove the rows where all the elements are NA



library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600




Or using the same logic with tidyverse. Grouped by 'id', change the order or elements in all other column with mutate_all by ordering on the logical vector (is.na(column)) and keep the rows having at least one non-NA (filter_at)



library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600




Ordering a vector/column based on logical indexing is simple.



v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6


use that index to change the order of values



v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA





share|improve this answer





















  • 1





    Thank you for the beautiful answer. Could you please explain this part ` mutate_all(funs(.[order(is.na(.))])) %>% in this answer

    – amrrs
    yesterday













  • @amrrs Updated the answer. Hopefully, it helps

    – akrun
    yesterday











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});






Ruilin is a new contributor. Be nice, and check out our Code of Conduct.










draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54141835%2fshift-values-in-columns-up-if-there-are-missing-values-above%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























3 Answers
3






active

oldest

votes








3 Answers
3






active

oldest

votes









active

oldest

votes






active

oldest

votes









5














Here is a rough concept using data.table that can be refined:



library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}

setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]

id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600





share|improve this answer
























  • Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!

    – Ruilin
    yesterday






  • 1





    In the last step, specify the .SDcols and you can use !Reduce('&', .SD) to create the logical vector

    – akrun
    yesterday











  • Also if it is all the columns except the id you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)].

    – snoram
    yesterday
















5














Here is a rough concept using data.table that can be refined:



library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}

setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]

id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600





share|improve this answer
























  • Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!

    – Ruilin
    yesterday






  • 1





    In the last step, specify the .SDcols and you can use !Reduce('&', .SD) to create the logical vector

    – akrun
    yesterday











  • Also if it is all the columns except the id you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)].

    – snoram
    yesterday














5












5








5







Here is a rough concept using data.table that can be refined:



library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}

setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]

id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600





share|improve this answer













Here is a rough concept using data.table that can be refined:



library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}

setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]

id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600






share|improve this answer












share|improve this answer



share|improve this answer










answered yesterday









snoramsnoram

6,549831




6,549831













  • Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!

    – Ruilin
    yesterday






  • 1





    In the last step, specify the .SDcols and you can use !Reduce('&', .SD) to create the logical vector

    – akrun
    yesterday











  • Also if it is all the columns except the id you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)].

    – snoram
    yesterday



















  • Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!

    – Ruilin
    yesterday






  • 1





    In the last step, specify the .SDcols and you can use !Reduce('&', .SD) to create the logical vector

    – akrun
    yesterday











  • Also if it is all the columns except the id you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)].

    – snoram
    yesterday

















Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!

– Ruilin
yesterday





Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!

– Ruilin
yesterday




1




1





In the last step, specify the .SDcols and you can use !Reduce('&', .SD) to create the logical vector

– akrun
yesterday





In the last step, specify the .SDcols and you can use !Reduce('&', .SD) to create the logical vector

– akrun
yesterday













Also if it is all the columns except the id you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)].

– snoram
yesterday





Also if it is all the columns except the id you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)].

– snoram
yesterday













4














Don't think this is the most efficient way to do it but one option



library(rowr)

df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1


# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600


We split the dataframe into list of dataframe for every id and for each dataframe we remove the NA values using na.omit and use cbind.fill to fill the values with NA and finally merge the list of dataframes back into one using rbind with do.call.






share|improve this answer


























  • Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!

    – Ruilin
    yesterday











  • @Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.

    – Ronak Shah
    yesterday


















4














Don't think this is the most efficient way to do it but one option



library(rowr)

df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1


# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600


We split the dataframe into list of dataframe for every id and for each dataframe we remove the NA values using na.omit and use cbind.fill to fill the values with NA and finally merge the list of dataframes back into one using rbind with do.call.






share|improve this answer


























  • Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!

    – Ruilin
    yesterday











  • @Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.

    – Ronak Shah
    yesterday
















4












4








4







Don't think this is the most efficient way to do it but one option



library(rowr)

df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1


# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600


We split the dataframe into list of dataframe for every id and for each dataframe we remove the NA values using na.omit and use cbind.fill to fill the values with NA and finally merge the list of dataframes back into one using rbind with do.call.






share|improve this answer















Don't think this is the most efficient way to do it but one option



library(rowr)

df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1


# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600


We split the dataframe into list of dataframe for every id and for each dataframe we remove the NA values using na.omit and use cbind.fill to fill the values with NA and finally merge the list of dataframes back into one using rbind with do.call.







share|improve this answer














share|improve this answer



share|improve this answer








edited yesterday

























answered yesterday









Ronak ShahRonak Shah

34.1k103854




34.1k103854













  • Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!

    – Ruilin
    yesterday











  • @Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.

    – Ronak Shah
    yesterday





















  • Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!

    – Ruilin
    yesterday











  • @Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.

    – Ronak Shah
    yesterday



















Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!

– Ruilin
yesterday





Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!

– Ruilin
yesterday













@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.

– Ronak Shah
yesterday







@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.

– Ronak Shah
yesterday













3














Here is an option with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'id', order the other column based on the NA values, then create an index to remove the rows where all the elements are NA



library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600




Or using the same logic with tidyverse. Grouped by 'id', change the order or elements in all other column with mutate_all by ordering on the logical vector (is.na(column)) and keep the rows having at least one non-NA (filter_at)



library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600




Ordering a vector/column based on logical indexing is simple.



v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6


use that index to change the order of values



v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA





share|improve this answer





















  • 1





    Thank you for the beautiful answer. Could you please explain this part ` mutate_all(funs(.[order(is.na(.))])) %>% in this answer

    – amrrs
    yesterday













  • @amrrs Updated the answer. Hopefully, it helps

    – akrun
    yesterday
















3














Here is an option with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'id', order the other column based on the NA values, then create an index to remove the rows where all the elements are NA



library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600




Or using the same logic with tidyverse. Grouped by 'id', change the order or elements in all other column with mutate_all by ordering on the logical vector (is.na(column)) and keep the rows having at least one non-NA (filter_at)



library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600




Ordering a vector/column based on logical indexing is simple.



v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6


use that index to change the order of values



v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA





share|improve this answer





















  • 1





    Thank you for the beautiful answer. Could you please explain this part ` mutate_all(funs(.[order(is.na(.))])) %>% in this answer

    – amrrs
    yesterday













  • @amrrs Updated the answer. Hopefully, it helps

    – akrun
    yesterday














3












3








3







Here is an option with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'id', order the other column based on the NA values, then create an index to remove the rows where all the elements are NA



library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600




Or using the same logic with tidyverse. Grouped by 'id', change the order or elements in all other column with mutate_all by ordering on the logical vector (is.na(column)) and keep the rows having at least one non-NA (filter_at)



library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600




Ordering a vector/column based on logical indexing is simple.



v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6


use that index to change the order of values



v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA





share|improve this answer















Here is an option with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'id', order the other column based on the NA values, then create an index to remove the rows where all the elements are NA



library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600




Or using the same logic with tidyverse. Grouped by 'id', change the order or elements in all other column with mutate_all by ordering on the logical vector (is.na(column)) and keep the rows having at least one non-NA (filter_at)



library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600




Ordering a vector/column based on logical indexing is simple.



v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6


use that index to change the order of values



v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA






share|improve this answer














share|improve this answer



share|improve this answer








edited yesterday

























answered yesterday









akrunakrun

400k13190265




400k13190265








  • 1





    Thank you for the beautiful answer. Could you please explain this part ` mutate_all(funs(.[order(is.na(.))])) %>% in this answer

    – amrrs
    yesterday













  • @amrrs Updated the answer. Hopefully, it helps

    – akrun
    yesterday














  • 1





    Thank you for the beautiful answer. Could you please explain this part ` mutate_all(funs(.[order(is.na(.))])) %>% in this answer

    – amrrs
    yesterday













  • @amrrs Updated the answer. Hopefully, it helps

    – akrun
    yesterday








1




1





Thank you for the beautiful answer. Could you please explain this part ` mutate_all(funs(.[order(is.na(.))])) %>% in this answer

– amrrs
yesterday







Thank you for the beautiful answer. Could you please explain this part ` mutate_all(funs(.[order(is.na(.))])) %>% in this answer

– amrrs
yesterday















@amrrs Updated the answer. Hopefully, it helps

– akrun
yesterday





@amrrs Updated the answer. Hopefully, it helps

– akrun
yesterday










Ruilin is a new contributor. Be nice, and check out our Code of Conduct.










draft saved

draft discarded


















Ruilin is a new contributor. Be nice, and check out our Code of Conduct.













Ruilin is a new contributor. Be nice, and check out our Code of Conduct.












Ruilin is a new contributor. Be nice, and check out our Code of Conduct.
















Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54141835%2fshift-values-in-columns-up-if-there-are-missing-values-above%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Сан-Квентин

Алькесар

Josef Freinademetz