Shift values in columns up if there are missing values above
I have a data frame like this:
df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))
which looks like this:
id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600
I would like to shift values in columns up if there are missing values above (by group). The result should look like this:
id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600
I have no idea how to do this. Any thoughts?
r dataframe
New contributor
add a comment |
I have a data frame like this:
df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))
which looks like this:
id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600
I would like to shift values in columns up if there are missing values above (by group). The result should look like this:
id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600
I have no idea how to do this. Any thoughts?
r dataframe
New contributor
Should "NA" be NA (i.e not a string?)
– Khaynes
yesterday
1
Edit your question. NA is currently a string.
– NelsonGon
yesterday
1
Are your numbers really of classcharacter
?
– LAP
yesterday
add a comment |
I have a data frame like this:
df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))
which looks like this:
id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600
I would like to shift values in columns up if there are missing values above (by group). The result should look like this:
id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600
I have no idea how to do this. Any thoughts?
r dataframe
New contributor
I have a data frame like this:
df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A",
"B", "B", "B", "B", "B", "B"),
var1 = c("100", "200", "300", NA, NA, NA, NA, NA,
"100", "200", "300", NA, NA, NA),
var2 = c("100", NA, NA, "400", "500", "600", NA, NA,
NA, NA, NA, "400", NA, NA),
var3 = c("200", NA, NA, NA, NA, NA, "700", "800",
"500", NA, NA, NA, "500", "600"))
which looks like this:
id var1 var2 var3
A 100 100 200
A 200 <NA> <NA>
A 300 <NA> <NA>
A <NA> 400 <NA>
A <NA> 500 <NA>
A <NA> 600 <NA>
A <NA> <NA> 700
A <NA> <NA> 800
B 100 <NA> 500
B 200 <NA> <NA>
B 300 <NA> <NA>
B <NA> 400 <NA>
B <NA> <NA> 500
B <NA> <NA> 600
I would like to shift values in columns up if there are missing values above (by group). The result should look like this:
id var1 var2 var3
A 100 100 200
A 200 400 700
A 300 500 800
A <NA> 600 <NA>
B 100 400 500
B 200 <NA> 500
B 300 <NA> 600
I have no idea how to do this. Any thoughts?
r dataframe
r dataframe
New contributor
New contributor
edited yesterday
Ronak Shah
34.1k103854
34.1k103854
New contributor
asked yesterday
RuilinRuilin
442
442
New contributor
New contributor
Should "NA" be NA (i.e not a string?)
– Khaynes
yesterday
1
Edit your question. NA is currently a string.
– NelsonGon
yesterday
1
Are your numbers really of classcharacter
?
– LAP
yesterday
add a comment |
Should "NA" be NA (i.e not a string?)
– Khaynes
yesterday
1
Edit your question. NA is currently a string.
– NelsonGon
yesterday
1
Are your numbers really of classcharacter
?
– LAP
yesterday
Should "NA" be NA (i.e not a string?)
– Khaynes
yesterday
Should "NA" be NA (i.e not a string?)
– Khaynes
yesterday
1
1
Edit your question. NA is currently a string.
– NelsonGon
yesterday
Edit your question. NA is currently a string.
– NelsonGon
yesterday
1
1
Are your numbers really of class
character
?– LAP
yesterday
Are your numbers really of class
character
?– LAP
yesterday
add a comment |
3 Answers
3
active
oldest
votes
Here is a rough concept using data.table
that can be refined:
library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}
setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]
id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600
Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!
– Ruilin
yesterday
1
In the last step, specify the.SDcols
and you can use!Reduce('&', .SD)
to create the logical vector
– akrun
yesterday
Also if it is all the columns except theid
you could do smth like:out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)]
.
– snoram
yesterday
add a comment |
Don't think this is the most efficient way to do it but one option
library(rowr)
df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1
# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600
We split
the dataframe into list of dataframe for every id
and for each dataframe we remove the NA
values using na.omit
and use cbind.fill
to fill the values with NA
and finally merge the list of dataframes back into one using rbind
with do.call
.
Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!
– Ruilin
yesterday
@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.
– Ronak Shah
yesterday
add a comment |
Here is an option with data.table
. Convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'id', order
the other column based on the NA values, then create an index to remove the rows where all the elements are NA
library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600
Or using the same logic with tidyverse
. Grouped by 'id', change the order
or elements in all other column with mutate_all
by order
ing on the logical vector (is.na(column)
) and keep the rows having at least one non-NA (filter_at
)
library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600
Ordering a vector/column based on logical indexing is simple.
v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6
use that index to change the order of values
v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA
1
Thank you for the beautiful answer. Could you please explain this part `mutate_all(funs(.[order(is.na(.))])) %>%
in this answer
– amrrs
yesterday
@amrrs Updated the answer. Hopefully, it helps
– akrun
yesterday
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Ruilin is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54141835%2fshift-values-in-columns-up-if-there-are-missing-values-above%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
Here is a rough concept using data.table
that can be refined:
library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}
setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]
id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600
Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!
– Ruilin
yesterday
1
In the last step, specify the.SDcols
and you can use!Reduce('&', .SD)
to create the logical vector
– akrun
yesterday
Also if it is all the columns except theid
you could do smth like:out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)]
.
– snoram
yesterday
add a comment |
Here is a rough concept using data.table
that can be refined:
library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}
setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]
id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600
Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!
– Ruilin
yesterday
1
In the last step, specify the.SDcols
and you can use!Reduce('&', .SD)
to create the logical vector
– akrun
yesterday
Also if it is all the columns except theid
you could do smth like:out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)]
.
– snoram
yesterday
add a comment |
Here is a rough concept using data.table
that can be refined:
library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}
setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]
id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600
Here is a rough concept using data.table
that can be refined:
library(data.table)
# Helper function:
shift_up <- function(x) {
n <- length(x)
x <- x[!is.na(x)]
length(x) <- n
x
}
setDT(df)
df[, lapply(.SD, shift_up), id][!(is.na(var1) & is.na(var2) & is.na(var3))]
id var1 var2 var3
1: A 100 100 200
2: A 200 400 700
3: A 300 500 800
4: A <NA> 600 <NA>
5: B 100 400 500
6: B 200 <NA> 500
7: B 300 <NA> 600
answered yesterday
snoramsnoram
6,549831
6,549831
Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!
– Ruilin
yesterday
1
In the last step, specify the.SDcols
and you can use!Reduce('&', .SD)
to create the logical vector
– akrun
yesterday
Also if it is all the columns except theid
you could do smth like:out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)]
.
– snoram
yesterday
add a comment |
Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!
– Ruilin
yesterday
1
In the last step, specify the.SDcols
and you can use!Reduce('&', .SD)
to create the logical vector
– akrun
yesterday
Also if it is all the columns except theid
you could do smth like:out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)]
.
– snoram
yesterday
Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!
– Ruilin
yesterday
Hi snoram! Thanks for helping me with this! Your codes work out well! I just want to know if there is a simple way to write "[!(is.na(var1) & is.na(var2) & is.na(var3))]". In my actual data frame, I have over 30 variables. It would be great if I don't have to write this one by one. Thanks again!
– Ruilin
yesterday
1
1
In the last step, specify the
.SDcols
and you can use !Reduce('&', .SD)
to create the logical vector– akrun
yesterday
In the last step, specify the
.SDcols
and you can use !Reduce('&', .SD)
to create the logical vector– akrun
yesterday
Also if it is all the columns except the
id
you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)]
.– snoram
yesterday
Also if it is all the columns except the
id
you could do smth like: out <- df[, lapply(.SD, shift_up), id]; out[rowSums(is.na(out[,!"id"])) < (length(out) - 1)]
.– snoram
yesterday
add a comment |
Don't think this is the most efficient way to do it but one option
library(rowr)
df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1
# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600
We split
the dataframe into list of dataframe for every id
and for each dataframe we remove the NA
values using na.omit
and use cbind.fill
to fill the values with NA
and finally merge the list of dataframes back into one using rbind
with do.call
.
Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!
– Ruilin
yesterday
@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.
– Ronak Shah
yesterday
add a comment |
Don't think this is the most efficient way to do it but one option
library(rowr)
df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1
# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600
We split
the dataframe into list of dataframe for every id
and for each dataframe we remove the NA
values using na.omit
and use cbind.fill
to fill the values with NA
and finally merge the list of dataframes back into one using rbind
with do.call
.
Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!
– Ruilin
yesterday
@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.
– Ronak Shah
yesterday
add a comment |
Don't think this is the most efficient way to do it but one option
library(rowr)
df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1
# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600
We split
the dataframe into list of dataframe for every id
and for each dataframe we remove the NA
values using na.omit
and use cbind.fill
to fill the values with NA
and finally merge the list of dataframes back into one using rbind
with do.call
.
Don't think this is the most efficient way to do it but one option
library(rowr)
df1 <- do.call(rbind, lapply(split(df, df$id), function(x) {
data.frame(id = x$id[1], do.call(cbind.fill,c(sapply(x[-1], na.omit),fill = NA)))
}))
names(df1) <- names(df)
df1
# id var1 var2 var3
#A.1 A 100 100 200
#A.2 A 200 400 700
#A.3 A 300 500 800
#A.4 A <NA> 600 <NA>
#B.1 B 100 400 500
#B.2 B 200 <NA> 500
#B.3 B 300 <NA> 600
We split
the dataframe into list of dataframe for every id
and for each dataframe we remove the NA
values using na.omit
and use cbind.fill
to fill the values with NA
and finally merge the list of dataframes back into one using rbind
with do.call
.
edited yesterday
answered yesterday
Ronak ShahRonak Shah
34.1k103854
34.1k103854
Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!
– Ruilin
yesterday
@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.
– Ronak Shah
yesterday
add a comment |
Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!
– Ruilin
yesterday
@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.
– Ronak Shah
yesterday
Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!
– Ruilin
yesterday
Hi Ronak! Thanks for the excellent code! Just wondering if there is a way to keep the column names unchanged? In my actual data frame, the columns names are years; it will be very confusing if I mess it up. Thanks again for your help!
– Ruilin
yesterday
@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.
– Ronak Shah
yesterday
@Ruilin yes, you can save the output into different object and rename it using the original dataframe. Updated the answer with that change.
– Ronak Shah
yesterday
add a comment |
Here is an option with data.table
. Convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'id', order
the other column based on the NA values, then create an index to remove the rows where all the elements are NA
library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600
Or using the same logic with tidyverse
. Grouped by 'id', change the order
or elements in all other column with mutate_all
by order
ing on the logical vector (is.na(column)
) and keep the rows having at least one non-NA (filter_at
)
library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600
Ordering a vector/column based on logical indexing is simple.
v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6
use that index to change the order of values
v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA
1
Thank you for the beautiful answer. Could you please explain this part `mutate_all(funs(.[order(is.na(.))])) %>%
in this answer
– amrrs
yesterday
@amrrs Updated the answer. Hopefully, it helps
– akrun
yesterday
add a comment |
Here is an option with data.table
. Convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'id', order
the other column based on the NA values, then create an index to remove the rows where all the elements are NA
library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600
Or using the same logic with tidyverse
. Grouped by 'id', change the order
or elements in all other column with mutate_all
by order
ing on the logical vector (is.na(column)
) and keep the rows having at least one non-NA (filter_at
)
library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600
Ordering a vector/column based on logical indexing is simple.
v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6
use that index to change the order of values
v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA
1
Thank you for the beautiful answer. Could you please explain this part `mutate_all(funs(.[order(is.na(.))])) %>%
in this answer
– amrrs
yesterday
@amrrs Updated the answer. Hopefully, it helps
– akrun
yesterday
add a comment |
Here is an option with data.table
. Convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'id', order
the other column based on the NA values, then create an index to remove the rows where all the elements are NA
library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600
Or using the same logic with tidyverse
. Grouped by 'id', change the order
or elements in all other column with mutate_all
by order
ing on the logical vector (is.na(column)
) and keep the rows having at least one non-NA (filter_at
)
library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600
Ordering a vector/column based on logical indexing is simple.
v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6
use that index to change the order of values
v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA
Here is an option with data.table
. Convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'id', order
the other column based on the NA values, then create an index to remove the rows where all the elements are NA
library(data.table)
df1 <- setDT(df)[, lapply(.SD, function(x) x[order(is.na(x))]), id]
df1[df1[,!Reduce(`&`, lapply(.SD, is.na)), .SDcols = var1:var3]]
# id var1 var2 var3
#1: A 100 100 200
#2: A 200 400 700
#3: A 300 500 800
#4: A <NA> 600 <NA>
#5: B 100 400 500
#6: B 200 <NA> 500
#7: B 300 <NA> 600
Or using the same logic with tidyverse
. Grouped by 'id', change the order
or elements in all other column with mutate_all
by order
ing on the logical vector (is.na(column)
) and keep the rows having at least one non-NA (filter_at
)
library(tidyverse)
df %>%
group_by(id) %>%
mutate_all(funs(.[order(is.na(.))])) %>%
filter_at(vars(var1:var3), any_vars(!is.na(.)))
# A tibble: 7 x 4
# Groups: id [2]
# id var1 var2 var3
# <fct> <fct> <fct> <fct>
#1 A 100 100 200
#2 A 200 400 700
#3 A 300 500 800
#4 A <NA> 600 <NA>
#5 B 100 400 500
#6 B 200 <NA> 500
#7 B 300 <NA> 600
Ordering a vector/column based on logical indexing is simple.
v1 <- c(1:3, NA, 5, NA, 7)
order(is.na(v1)) #gives the index of order
#[1] 1 2 3 5 7 4 6
use that index to change the order of values
v1[order(is.na(v1))]
#[1] 1 2 3 5 7 NA NA
edited yesterday
answered yesterday
akrunakrun
400k13190265
400k13190265
1
Thank you for the beautiful answer. Could you please explain this part `mutate_all(funs(.[order(is.na(.))])) %>%
in this answer
– amrrs
yesterday
@amrrs Updated the answer. Hopefully, it helps
– akrun
yesterday
add a comment |
1
Thank you for the beautiful answer. Could you please explain this part `mutate_all(funs(.[order(is.na(.))])) %>%
in this answer
– amrrs
yesterday
@amrrs Updated the answer. Hopefully, it helps
– akrun
yesterday
1
1
Thank you for the beautiful answer. Could you please explain this part `
mutate_all(funs(.[order(is.na(.))])) %>%
in this answer– amrrs
yesterday
Thank you for the beautiful answer. Could you please explain this part `
mutate_all(funs(.[order(is.na(.))])) %>%
in this answer– amrrs
yesterday
@amrrs Updated the answer. Hopefully, it helps
– akrun
yesterday
@amrrs Updated the answer. Hopefully, it helps
– akrun
yesterday
add a comment |
Ruilin is a new contributor. Be nice, and check out our Code of Conduct.
Ruilin is a new contributor. Be nice, and check out our Code of Conduct.
Ruilin is a new contributor. Be nice, and check out our Code of Conduct.
Ruilin is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f54141835%2fshift-values-in-columns-up-if-there-are-missing-values-above%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Should "NA" be NA (i.e not a string?)
– Khaynes
yesterday
1
Edit your question. NA is currently a string.
– NelsonGon
yesterday
1
Are your numbers really of class
character
?– LAP
yesterday