Question:
I need to create a column that counts the occurrences of each value in another column. For example, column “y” should hold the running count of the elements in column “x”:
x y
1 A 1
2 B 1
3 A 2
4 C 1
5 B 2
6 A 3
I probably need to write a loop, but I could not come up with anything efficient.
Usually only the final result of the count (the total number of occurrences) is considered. However, I need a counter that keeps every step of the count, so that each occurrence gets its own identification number.
Answer:
There is no need for a loop. You can solve this problem with the dplyr package:
dados <- structure(list(x = structure(c(1L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 2L, 2L),
.Label = c("A", "B", "C"), class = "factor")), .Names = "x",
row.names = c(NA, -10L), class = "data.frame")
library(dplyr)
dados %>%
  group_by(x) %>%
  mutate(y = 1:n())  # running count within each value of x
# A tibble: 10 x 2
# Groups: x [3]
x y
<fctr> <int>
1 A 1
2 C 1
3 B 1
4 B 2
5 B 3
6 C 2
7 C 3
8 C 4
9 B 4
10 B 5
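As a check against the example in the question, here is a minimal sketch that applies the same grouped running count to the six values of x given there. The data frame name `question` is made up for illustration, and `row_number()` is used in place of `1:n()`; within a group both number the rows 1, 2, 3, and so on. The `ave()` line is a base R alternative and is not part of the answer above.

```r
library(dplyr)

# Values of x copied from the question; the name `question` is illustrative
question <- data.frame(x = c("A", "B", "A", "C", "B", "A"),
                       stringsAsFactors = FALSE)

# dplyr: number the rows within each value of x, in order of appearance
with_dplyr <- question %>%
  group_by(x) %>%
  mutate(y = row_number()) %>%
  ungroup()

# Base R alternative: ave() applies seq_along() within each group of x
with_base <- ave(seq_along(question$x), question$x, FUN = seq_along)

with_dplyr$y
with_base
```

Both vectors should match the y column shown in the question (1, 1, 2, 1, 2, 3).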
Your problem is straightforward and, as Marcus said, dplyr handles it. However, I found his solution not very general.
The following code counts the occurrences of x in each group of y (note that I slightly changed your data to get a count greater than 1).
library(dplyr)
df <- data.frame(
  x = c('A', 'B', 'A', 'C', 'B', 'A', 'A'),
  y = c(1, 1, 2, 1, 2, 3, 1)
)
# Count the rows for each (y, x) combination
df %>%
  group_by(y, x) %>%
  count()
Resulting in:
# A tibble: 6 x 3
# Groups: y, x [6]
y x n
<dbl> <fctr> <int>
1 1 A 2
2 1 B 1
3 1 C 1
4 2 A 1
5 2 B 1
6 3 A 1
Another way to count the elements of a group is to use the function n() inside summarise():
df %>%
  group_by(y, x) %>%
  summarise(contagem = n())
The counts are the same as in the previous result; only the count column is named contagem instead of n.
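To make the comparison concrete, the sketch below computes the counts both ways and compares them. It assumes dplyr 1.0.0 or later for the `.groups` argument of `summarise()`, and it calls `count(y, x)` directly, which is shorthand for grouping by those columns and counting. The names `via_count` and `via_summarise` are illustrative.

```r
library(dplyr)

df <- data.frame(
  x = c("A", "B", "A", "C", "B", "A", "A"),
  y = c(1, 1, 2, 1, 2, 3, 1),
  stringsAsFactors = FALSE
)

# count() with the grouping columns as arguments; the count column is named n
via_count <- df %>%
  count(y, x)

# Explicit group_by() + summarise(); .groups = "drop" returns an ungrouped result
via_summarise <- df %>%
  group_by(y, x) %>%
  summarise(contagem = n(), .groups = "drop")

# TRUE when both approaches give the same counts, row by row
identical(via_count$n, via_summarise$contagem)
```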
If you need to separate the table into several smaller tables, according to the y values, you can do this:
df %>%
  group_by(y, x) %>%
  count() %>%
  split(.$y)
This results in a list of tibbles (easily convertible to data frames):
$`1`
# A tibble: 3 x 3
# Groups: y, x [3]
y x n
<dbl> <fctr> <int>
1 1 A 2
2 1 B 1
3 1 C 1
$`2`
# A tibble: 2 x 3
# Groups: y, x [2]
y x n
<dbl> <fctr> <int>
1 2 A 1
2 2 B 1
$`3`
# A tibble: 1 x 3
# Groups: y, x [1]
y x n
<dbl> <fctr> <int>
1 3 A 1
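The conversion to plain data frames mentioned above could be sketched as follows. This is one possible way, not necessarily what the answer's author had in mind: the counts are ungrouped before splitting to keep the pieces simple, and the names `counts`, `tables` and `tables_df` are illustrative.

```r
library(dplyr)

df <- data.frame(
  x = c("A", "B", "A", "C", "B", "A", "A"),
  y = c(1, 1, 2, 1, 2, 3, 1),
  stringsAsFactors = FALSE
)

# Counts per (y, x) combination, ungrouped before splitting
counts <- df %>%
  group_by(y, x) %>%
  count() %>%
  ungroup()

# One tibble per value of y; the list names come from the values of y
tables <- split(counts, counts$y)

# Convert every tibble in the list to a plain data frame
tables_df <- lapply(tables, as.data.frame)

names(tables_df)
class(tables_df[["1"]])
```

Each element can then be accessed by name, for example `tables_df[["2"]]` for the rows where y is 2.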