How to use Hive to retrieve data from the Hadoop Distributed File System (HDFS)
We have business data stored in Parquet files on the Hadoop Distributed File System (HDFS). Our Data Solutions team has asked us to use Hive (HiveQL) to retrieve the data. Does PRPC support using HiveQL to retrieve data from HDFS? If yes, can you please provide the related technical or help documents on how to do it? If not, what approach does Pega recommend? We are using Pega Platform 8.2.3.
***Edited by Moderator: Lochan to tag SR***
We don't support HiveQL. However, if your data is stored as Parquet files in HDFS, you can use the HDFS Dataset to map and access the data. Please refer to the links below for more information.
- ERROR: Retrieve CSV file / data from HDFS File System
- Hadoop/Hive integration with Pega
- Hadoop Hive JDBC connection from Websphere using Datasource
- Why RDB Methods are only used to Retrieve Data From External Systems/Data bases Explain the Reason?
- When user searches for records in the data table in manager portal, system is retrieving and showing only first 500 results irrespective of number of records are present in table.