Data Import - Validating Data Type When Importing a CSV file

PIPE-FLO® Professional's powerful Data Import feature takes advantage of Python™ scripting to generate or retrieve data. This greatly speeds up the modeling process, but it can present issues when the data is bad, and users may find it difficult to locate and correct bad data in the .csv file being read. If the script is run against invalid data types or empty fields, it aborts and returns an error for the line of code where the exception was encountered. Fortunately, Python™ provides some avenues for us to catch these problems in advance and deal with them in a more expedient manner. Making this extra effort results in a Data Import script that is robust, accurate, and time saving across multiple files and users.
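As a minimal illustration of this catch-in-advance approach, the sketch below uses a plain try/except to convert a raw CSV field to a number without aborting on bad input. It is standard Python, independent of PIPE-FLO, and the helper name parse_flow is hypothetical:

```python
# Minimal sketch: convert a raw CSV field with try/except instead of
# letting one bad value abort the entire import run.
def parse_flow(raw):
    """Return the flow as a float, or None if the field is not a valid number."""
    try:
        return float(raw)
    except ValueError:
        return None

print(parse_flow('12.5'))   # valid value converts normally
print(parse_flow('null'))   # invalid value is caught and reported as None
print(parse_flow(''))       # empty field is caught the same way
```

A full import script can then count these None results as errors and move on to the next row, which is exactly the pattern used in the example below.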

The example below reads flow demand data (name, flow, and elevation) from a CSV file, checks the data for validity, and imports the valid items into a PIPE-FLO Professional model. Where exceptions are encountered, they are tallied and reported in the Data Import Output window. The flow demand names correspond to Ultra-pure water system.pipe, which can also be found in My Engineering Files > Projects > Sample Systems. The text below can also be copied into a text editor to create the UPW_Data.csv and .py files necessary to run this example in PIPE-FLO.

The UPW_Data.csv file consists of flow demand name, flow rate, and elevation in comma-delimited rows. Notice that there are some erroneous or empty values. This block of text can be copied into a text editor and saved as UPW_Data.csv:


The Data Import file locates UPW_Data.csv and uses a loop to read, test, and set the values in PIPE-FLO. Errors and update counts are printed to the Output window as they occur. The individual pieces of this script are discussed below, and the complete version is included at the end. This script has been updated for use with PIPE-FLO Professional 17.

#The first step is to load some standard Python modules to read the .csv file and perform the file operations:

import sys
import csv
import os

#With the modules loaded, we can define variables for the file path. In this case, the script looks in the directory
#where this .py file is located for a file named UPW_Data.csv.

dir = os.path.dirname( os.path.abspath( __file__ ) )
data_file_path = os.path.join( dir, 'UPW_Data.csv' )

#The script counts and reports the number of updates performed and errors encountered while running. These count
#variables need to be initialized to 0.

errors = 0
updates = 0

#Note: From this point forward, pay attention to the tabbing in the file. It indicates which statements are included
#in the various blocks and loops.

#The file then needs to be opened by the script using the csv reader module:

with open( data_file_path ) as csvfile:
	reader = csv.reader( csvfile )    

#The FOR loop is used to read each row of the file, and assign the values from those rows to variables. This loop will
#end when it runs out of data rows in the .csv file:

	for data_row in reader:
		name = data_row[0]
		flow = data_row[1]
		elevation = data_row[2]   

#The try statement tests the values without raising an error that would abort the script. In this case, testing the name:

		try:
			name_test = pipeflo().doc().get_flow_demand( name )

#If there is an exception (the name does not exist in the model) the script will print a line to the Output tab with the name
#of the device, add 1 to the error count variable and continue back to the top of the for loop:

		except RuntimeError:
			print( 'Did not find device', name )
			errors += 1
			continue

#The next try block tests to see that the flow variable value is valid by assigning it to a new variable as a floating point
#number. If the value is the wrong type, it is reported and the rest of that row is skipped by the continue statement. If it is a
#valid number, it is imported to the model and 1 is added to the update variable. This process is repeated for the elevation value.

		try:
			flow_tested = float( flow )

		except ValueError:
			print( 'Invalid flow value type: ', 'Device:', name, ', Value:', flow )
			errors += 1
			continue

		pipeflo().doc().get_flow_demand( name ).set_operation( operation( flow_rate( flow_tested, gpm ) ) )
		updates += 1

#NOTE: To set the flow demand values with PIPE-FLO 14, use the commented out function below instead:
		#pipeflo().doc().get_flow_demand( name ).set_flow( flow_tested )

		try:
			elevation_tested = float( elevation )

		except ValueError:
			print( 'Invalid elevation value type: ', 'Device:', name, ', Value:', elevation )
			errors += 1
			continue

		pipeflo().doc().get_flow_demand( name ).set_elevation( elevation_tested )
		updates += 1

#Once every row in the .csv file has been tested and imported, a summary is printed to the Data Import Output dialog:
print( 'Import finished.' )
if updates > 0:
	print( updates, 'items updated.' )
if errors > 0:
	print( errors, 'errors.' )
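The validate-and-count pattern of the loop above can also be tested outside PIPE-FLO with plain Python by replacing the pipeflo() calls with comments. The sketch below uses an in-memory CSV so it runs anywhere; the device names and values are illustrative only, not the actual UPW_Data.csv contents:

```python
import csv
import io

# In-memory stand-in for UPW_Data.csv with one bad flow and one
# empty elevation, mimicking the kinds of errors the script catches.
sample = io.StringIO(
    "GS-1,10.5,2.0\n"
    "GS-2,null,3.0\n"   # invalid flow value
    "GS-3,7.25,\n"      # empty elevation field
)

errors = 0
updates = 0
for name, flow, elevation in csv.reader(sample):
    try:
        flow_tested = float(flow)
    except ValueError:
        errors += 1
        continue            # skip the rest of this row
    updates += 1            # a real script would call set_operation() here
    try:
        elevation_tested = float(elevation)
    except ValueError:
        errors += 1
        continue
    updates += 1            # ...and set_elevation() here

print(updates, 'items updated.')   # prints: 3 items updated.
print(errors, 'errors.')           # prints: 2 errors.
```

Because the counting and error handling are separated from the PIPE-FLO calls, the same loop structure can be checked against a sample file before running it inside the Data Import dialog.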

Once created, the file can be opened in PIPE-FLO by dragging it onto the FLO-Sheet or by clicking the Data Import button and opening the file from there. Important: The .csv file must be properly named and located in the same folder as the .py file for this script to work.

Notice that the script prints the test failures and update/error counts in the Output tab of the Data Import dialog, and the Errors tab is empty. It should look like this:

Invalid flow value type: Device: GS-124-10 , Value: null
Invalid elevation value type: Device: GS-124-13 , Value:
Did not find device GS-124-16
Import finished.
21 items updated.
3 errors.

The real value of this script is not the time saved entering the values from the .csv file for a single project, but the flexibility to drop any .csv file with the appropriate column format into the folder and rerun the script at any time. This could be for future revisions to this model, or it could be applied to any PIPE-FLO model and .csv file pair as long as the flow demand names match. In addition to the time saved, the reduction in manual data transcription leads to better data integrity. Essentially, the time invested in developing a flexible script provides savings and accuracy on all current and future models. By modifying the variable-to-data-row mapping, as well as the import commands, this script can also be adapted to work with any .pipe file, .csv file, and devices.
