Reading RDS Files and Valid Metadata Errors in Alteryx

A few days ago, a user on the Alteryx Community forums asked if Alteryx could read an RDS file. It's a valid question, since .RDS isn’t an available file type in the Input tool. An RDS file holds a single R object saved to disk with saveRDS(), and that object is easily restored with the function readRDS(). While it is true that you cannot read an .RDS file with the Input tool, the answer is still simple if you have the predictive tools installed: use the R tool.

No valid metadata

One R tool and three lines of code are all it takes to read an .RDS file. This assumes that the object in the .RDS file is a data frame or can be coerced to a data frame.
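As a rough sketch, those three lines might look something like the following inside the R tool. The file path is a placeholder assumption; readRDS() and as.data.frame() are base R, and write.Alteryx() is the R tool's output function used later in this post.

```r
# Hypothetical path: replace with the location of your own .RDS file
rds.object <- readRDS("C:/path/to/your/file.rds")
# Coerce to a data frame in case the stored object is a matrix, list, etc.
rds.data <- as.data.frame(rds.object)
# Send the data frame to output anchor 1 of the R tool
write.Alteryx(rds.data, 1)
```

If the stored object cannot be coerced to a data frame, as.data.frame() will stop with an error, so it is worth checking the object in a regular R session first.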

Valid Metadata

All good, right? The workflow completes successfully except for that annoying “valid metadata” error. The thing is, it goes away when you run the workflow, but it will come back if you make any changes to the code inside of the R tool. Even worse is that if you use the R tool inside of a macro, then your user will receive the same error notification.

The error isn’t even an error so much as it is a warning that metadata isn’t being passed from the R tool to downstream tools. For example, when you connect a Select tool to an Input tool, the Select tool automagically knows the field names, types and other information about the incoming data.

[Image: Select tool in Alteryx]

However, if you pass that same input through an R tool, you don’t get the same results until you run the workflow. And if you change anything in the R tool, then the error message comes back. The error is Alteryx’s way of telling you that the downstream tools don’t know any information about the fields that are passed through until you run the workflow.

[Image: New Workflow]

Make It Stop!

Alright, so now we know why the error appears, but how do we make it stop? First, we need to use the function read.AlteryxMetaInfo(), which reads the metadata from an R tool input as a data frame. Then we need to write that metadata to the appropriate output using write.AlteryxAddFieldMetaInfo(). For this to work, Run Script When Refreshed must be checked near the top of the R tool configuration. That setting makes Alteryx run the entire block of code in the R tool each time the workflow is refreshed.

The trick is that the functions read.AlteryxMetaInfo() and write.AlteryxAddFieldMetaInfo() only work when Alteryx is refreshing. They do not work when the workflow is running. Fortunately, the R tool gives us a global variable to tell the two situations apart: AlteryxFullUpdate. AlteryxFullUpdate is TRUE when Alteryx is refreshing and FALSE otherwise. Using an if-else statement, we can run the metadata update during a refresh and run the rest of the code during a full run.

Here is the pseudocode:

IF AlteryxFullUpdate THEN
  Read Metadata From Inputs
  Write Metadata to Outputs
ELSE
  Run Full Script

Here is an example of some working code:

#AlteryxFullUpdate is TRUE while refreshing
#in this case, run the metadata update
if (AlteryxFullUpdate) {
  #read the metadata from input #1
  m.data <- read.AlteryxMetaInfo("#1")

  #one row of m.data is metadata for one column from input #1
  #loop through rows, writing metadata for each row
  #seq_len() yields an empty sequence if there are no rows,
  #whereas seq(0) would loop over 1 and 0
  lapply(seq_len(nrow(m.data)), function(x) {
    col <- m.data[x,]
    write.AlteryxAddFieldMetaInfo(nOutput = 1,
                                  name = as.character(col$Name),
                                  fieldType = as.character(col$Type),
                                  size = col$Size,
                                  scale = col$Scale)
  })

#if AlteryxFullUpdate is not TRUE the workflow is running
#in that case, just run the full script
} else {
  data <- read.Alteryx("#1", mode="data.frame")
  write.Alteryx(data, 1)
}

You can copy this into the R tool and observe that metadata is passed through and that you no longer receive any metadata errors. To test it, change the data connected to input #1 of the R tool, press F5 and check that the metadata on output #1 has updated. If you are having trouble, check the attached module for a working version.
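The example above copies metadata from input #1, but an R tool that only reads an .RDS file has no input to copy from. One possible approach, sketched here under several assumptions, is to derive the metadata from the data frame itself during a refresh. The file path and the alteryx.type() helper are hypothetical, the helper covers only a few column types, and both the field type names and the omission of the size and scale arguments should be checked against the R tool documentation before you rely on them.

```r
# Hypothetical path: replace with the location of your own .RDS file
rds.data <- as.data.frame(readRDS("C:/path/to/your/file.rds"))

# Hypothetical helper: pick an Alteryx field type name for an R column.
# The type names are assumptions; anything not matched is treated as text.
alteryx.type <- function(column) {
  if (is.logical(column)) {
    "Bool"
  } else if (is.integer(column)) {
    "Int32"
  } else if (is.numeric(column)) {
    "Double"
  } else {
    "V_WString"
  }
}

if (AlteryxFullUpdate) {
  # Refreshing: describe each column of the data frame on output 1
  for (field in names(rds.data)) {
    write.AlteryxAddFieldMetaInfo(nOutput = 1,
                                  name = field,
                                  fieldType = alteryx.type(rds.data[[field]]))
  }
} else {
  # Running: write the actual data to output 1
  write.Alteryx(rds.data, 1)
}
```

Note that this reads the file on every refresh, which may be slow for large .RDS files.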

Input-rds

More About the Author

Michael Treadwell

Data Engineering Lead