R Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric (3 Examples) (2024)

This tutorial shows how to debug the “Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric” in the R programming language.

The post is structured as follows:

1) Creating Example Data

2) Example 1: Reproduce the Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric

3) Example 2: Fix the Error by Removing Non-Numeric Columns

4) Example 3: Fix the Error by Converting Non-Numeric Columns to Numbers

Let’s dive into it:

Creating Example Data

Let’s first construct some example data:

set.seed(67932)                                  # Create example data frame
data <- data.frame(x1 = sample(LETTERS[1:3], 10, replace = TRUE),
                   x2 = round(rnorm(10), 2),
                   x3 = round(runif(10), 2))
data                                             # Print example data frame

[Table 1: Example data frame]

As you can see based on Table 1, our example data is a data frame and contains ten rows and three columns.
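Before running an analysis on such a data frame, it can help to check the class of each column. The following is a minimal sketch that rebuilds the example data and uses only base R functions; the object names col_classes and non_numeric are chosen here for illustration.

```r
set.seed(67932)                                  # Recreate the example data frame
data <- data.frame(x1 = sample(LETTERS[1:3], 10, replace = TRUE),
                   x2 = round(rnorm(10), 2),
                   x3 = round(runif(10), 2))

col_classes <- sapply(data, class)               # Class of each column
col_classes                                      # Print column classes

non_numeric <- names(data)[!sapply(data, is.numeric)]  # Names of non-numeric columns
non_numeric                                      # Print non-numeric column names
```

The column x1 should show up as character in R 4.0.0 and later, or as factor in older R versions where stringsAsFactors defaulted to TRUE. Either way, it is flagged as non-numeric.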

Example 1: Reproduce the Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric

Example 1 explains how to replicate the “Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric”.

Let’s assume that we want to run a Principal Component Analysis (PCA) on these data.

Then, we might try to apply the prcomp function to our data as shown below:

prcomp(data)                                     # Try to apply prcomp function
# Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

Unfortunately, the “Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric” is returned.

The reason for this error message is that our data frame contains the column x1, which has the character class. The same error would appear if the column were a factor.
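The cause can be reproduced without prcomp by calling colMeans directly. The sketch below uses two small hypothetical data frames (df_chr and df_fct are illustrative names, not part of the original example) and wraps the calls in tryCatch so that the script does not stop at the error.

```r
# Hypothetical data frames: one with a character column, one with a factor column
df_chr <- data.frame(g = c("A", "B", "C"), v = c(1.5, 2.5, 3.5))
df_fct <- data.frame(g = factor(c("A", "B", "C")), v = c(1.5, 2.5, 3.5))

# A data frame with a non-numeric column is converted to a character matrix
mode(as.matrix(df_chr))
mode(as.matrix(df_fct))

# Capture the errors instead of stopping the script
err_chr <- tryCatch(colMeans(df_chr, na.rm = TRUE), error = function(e) e)
err_fct <- tryCatch(colMeans(df_fct, na.rm = TRUE), error = function(e) e)

conditionMessage(err_chr)                        # Print the captured error message
conditionMessage(err_fct)
```

Both calls fail because a single non-numeric column turns the whole matrix into character values, so no column means can be computed.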

So how could we fix that? There are basically two alternatives, and I’m going to explain these alternatives in the following examples.

Keep on reading!

Example 2: Fix the Error by Removing Non-Numeric Columns

In this example, I’ll demonstrate how to drop all non-numeric variables from a data frame to avoid the “Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric”.

We can use the unlist, lapply, and is.numeric functions to create such a data frame subset:

data_new1 <- data[ , unlist(lapply(data, is.numeric))]   # Remove non-numeric columns
data_new1                                                # Print updated data frame

[Table 2: Data frame containing only the numeric columns x2 and x3]

As shown in Table 2, we have created a new data frame with the previous R syntax. This data frame contains only the two numeric columns x2 and x3.
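The same subset can be written in a few other ways in base R. The following sketch shows two alternatives; the names keep, data_alt1, and data_alt2 are chosen here for illustration and do not appear in the original code.

```r
set.seed(67932)                                  # Recreate the example data frame
data <- data.frame(x1 = sample(LETTERS[1:3], 10, replace = TRUE),
                   x2 = round(rnorm(10), 2),
                   x3 = round(runif(10), 2))

keep <- sapply(data, is.numeric)                 # TRUE for numeric columns
data_alt1 <- data[ , keep, drop = FALSE]         # drop = FALSE always returns a data frame
data_alt2 <- Filter(is.numeric, data)            # Keep columns for which is.numeric is TRUE

names(data_alt1)                                 # Print remaining column names
names(data_alt2)
```

Setting drop = FALSE is a precaution: if only one numeric column remained, the subset would otherwise collapse to a plain vector, which prcomp would then treat as a single-column input.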

Next, we can apply the prcomp function to these data:

prcomp(data_new1)                                # Apply prcomp function
# Standard deviations (1, .., p=2):
# [1] 1.2283189 0.2428404
# 
# Rotation (n x k) = (2 x 2):
#            PC1        PC2
# x2  0.99647810 0.08385344
# x3 -0.08385344 0.99647810

Works fine!

Example 3: Fix the Error by Converting Non-Numeric Columns to Numbers

Example 3 demonstrates how to convert non-numeric categorical data to numeric data in order to get rid of the “Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric”.

To accomplish this, we have to apply the as.numeric and as.factor functions to our non-numeric data frame column x1:

data_new2 <- data                                        # Duplicate data frame
data_new2$x1 <- as.numeric(as.factor(data_new2$x1))      # Convert categories to numbers
data_new2                                                # Print updated data frame

[Table 3: Data frame with x1 converted to numbers]

Table 3 shows the output of the previous code: We have transformed the categorical variable x1 into numbers.
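To see what this conversion does, consider the small sketch below. The vector x holds hypothetical category labels and is not part of the example data; the point is only to illustrate how the factor levels determine the resulting numbers.

```r
x <- c("B", "A", "C", "A")                       # Hypothetical category labels

f <- as.factor(x)                                # Convert labels to a factor
levels(f)                                        # Levels are sorted: "A" "B" "C"

codes <- as.numeric(f)                           # Integer codes follow the level order
codes                                            # 2 1 3 1

# Converting the labels directly does not work: non-numeric strings become NA
direct <- suppressWarnings(as.numeric(x))
direct                                           # NA NA NA NA
```

Note that the resulting codes imply an order and equal spacing between the categories, and prcomp treats them like any other numeric values. Whether that is sensible depends on the meaning of the categories in your data.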

Now, we can apply the prcomp function without any problems:

prcomp(data_new2)                                # Apply prcomp function
# Standard deviations (1, .., p=3):
# [1] 1.2734878 0.6608866 0.2316053
# 
# Rotation (n x k) = (3 x 3):
#           PC1       PC2         PC3
# x1 -0.3082818 0.9444851 -0.11362327
# x2  0.9471298 0.3158997  0.05614757
# x3 -0.0889241 0.0903067  0.99193609

Further Resources

You may also want to read the related tutorials on my website. A selection of articles is listed below:

  • Error : ‘names’ attribute must be the same length as the vector
  • Error in as.Date.numeric(X) : ‘origin’ must be supplied
  • Error in sort.int(x, na.last, decreasing, …) : ‘x’ must be atomic
  • Error in hist.default : ‘x’ must be numeric
  • Introduction to R Programming

Summary: In this article, you have learned how to avoid the “Error in colMeans(x, na.rm = TRUE) : ‘x’ must be numeric” in R programming.

I’m Joachim Schork. On this website, I provide statistics tutorials as well as code in Python and R programming.
