Class Schema

java.lang.Object
ai.rapids.cudf.Schema

public class Schema extends Object
The schema of data to be read in.
  • Field Details

    • INFERRED

      public static final Schema INFERRED
  • Method Details

    • getChild

      public Schema getChild(int i)
      Get the schema of a child element. Note that an inferred schema will have no children.
      Parameters:
      i - the index of the child to read.
      Returns:
      the Schema of the child at index i
      Throws:
      IndexOutOfBoundsException - if the index is outside the range of children.
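
      The lookup contract described above can be pictured with a small stand-in class. This is only a sketch: ChildLookupSketch, Node, child and numChildren are hypothetical names and not part of ai.rapids.cudf, and modelling an inferred schema as a node without a child list is an assumption.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a schema node; NOT the ai.rapids.cudf.Schema class.
final class ChildLookupSketch {
    static final class Node {
        // Assumption: null models a node that has no child list at all,
        // which is how this sketch represents an inferred schema.
        private final List<Node> children;

        Node(List<Node> children) {
            this.children = children;
        }

        int numChildren() {
            return children == null ? 0 : children.size();
        }

        // Returns the child at index i, or throws if i is outside [0, numChildren()).
        Node child(int i) {
            if (i < 0 || i >= numChildren()) {
                throw new IndexOutOfBoundsException(
                    "index " + i + " is outside [0, " + numChildren() + ")");
            }
            return children.get(i);
        }
    }

    // An inferred node has no children, so every index is out of range.
    static final Node INFERRED = new Node(null);

    public static void main(String[] args) {
        Node leaf = new Node(null);
        Node parent = new Node(Collections.singletonList(leaf));
        System.out.println(parent.child(0) == leaf);
        try {
            INFERRED.child(0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("inferred node has no children");
        }
    }
}
```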
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • builder

      public static Schema.Builder builder()
    • getFlattenedColumnNames

      public String[] getFlattenedColumnNames()
      Get the names of all columns in schema, flattened from all levels by depth-first traversal.
      Returns:
      An array containing names of all columns in schema.
    • getColumnNames

      public String[] getColumnNames()
      Get names of the top level child columns in schema.
      Returns:
      An array containing names of top level child columns.
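
      To contrast the flattened view with the top-level view, the following sketch walks a small tree in both ways. FlattenNamesSketch and Node are hypothetical stand-ins, not cudf classes. The sketch assumes a pre-order traversal (each column before its children) and that the unnamed root itself is not emitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a schema tree; NOT the ai.rapids.cudf.Schema class.
final class FlattenNamesSketch {
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            this.children.addAll(Arrays.asList(kids));
        }
    }

    // Pre-order depth-first walk: a column's name is emitted before its children's names.
    private static void collect(Node node, List<String> out) {
        out.add(node.name);
        for (Node child : node.children) {
            collect(child, out);
        }
    }

    // Names of every column below the root, at all levels.
    // Assumption: the root is an unnamed container and is skipped.
    static String[] flattenedNames(Node root) {
        List<String> out = new ArrayList<>();
        for (Node child : root.children) {
            collect(child, out);
        }
        return out.toArray(new String[0]);
    }

    // Names of the root's direct children only.
    static String[] topLevelNames(Node root) {
        String[] names = new String[root.children.size()];
        for (int i = 0; i < names.length; i++) {
            names[i] = root.children.get(i).name;
        }
        return names;
    }

    // Example tree: id, point{x, y}, label.
    static Node example() {
        return new Node(null,
            new Node("id"),
            new Node("point", new Node("x"), new Node("y")),
            new Node("label"));
    }

    public static void main(String[] args) {
        Node root = example();
        System.out.println(Arrays.toString(flattenedNames(root))); // [id, point, x, y, label]
        System.out.println(Arrays.toString(topLevelNames(root)));  // [id, point, label]
    }
}
```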
    • isNested

      public boolean isNested()
      Check if the schema is nested (i.e., top level type is LIST or STRUCT).
      Returns:
      true if the schema is nested, false otherwise.
    • hasNestedChildren

      public boolean hasNestedChildren()
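      Check whether any child of this schema is itself nested. A top level struct schema is always nested, so for formats like CSV the relevant question is whether any of its children are also nested.
      Returns:
      true if at least one child is nested, false otherwise.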
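
      The difference between a nested schema and a schema with nested children can be sketched as follows. NestedCheckSketch, Node and Kind are hypothetical names used only for illustration. The sketch assumes that only direct children are inspected.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a typed schema node; NOT the ai.rapids.cudf.Schema class.
final class NestedCheckSketch {
    enum Kind { INT32, STRING, LIST, STRUCT }

    static final class Node {
        final Kind kind;
        final List<Node> children;

        Node(Kind kind, Node... kids) {
            this.kind = kind;
            this.children = Arrays.asList(kids);
        }

        // Nested means the node's own type is LIST or STRUCT.
        boolean isNested() {
            return kind == Kind.LIST || kind == Kind.STRUCT;
        }

        // Assumption: only direct children are checked, not deeper descendants.
        boolean hasNestedChildren() {
            for (Node child : children) {
                if (child.isNested()) {
                    return true;
                }
            }
            return false;
        }
    }

    // A flat table: a top-level struct whose children are all primitive.
    static Node flatTable() {
        return new Node(Kind.STRUCT, new Node(Kind.INT32), new Node(Kind.STRING));
    }

    // A table with one list column.
    static Node tableWithList() {
        return new Node(Kind.STRUCT,
            new Node(Kind.INT32),
            new Node(Kind.LIST, new Node(Kind.INT32)));
    }

    public static void main(String[] args) {
        // Both roots are nested, but only the second has a nested child.
        System.out.println(flatTable().isNested() + " " + flatTable().hasNestedChildren());
        System.out.println(tableWithList().isNested() + " " + tableWithList().hasNestedChildren());
    }
}
```

      In this sketch a CSV-style reader would accept flatTable() and reject tableWithList(), even though isNested() is true for both roots.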
    • getFlattenedTypeIds

      public int[] getFlattenedTypeIds()
      Get type ids of the columns flattened from all levels in schema by depth-first traversal.
      Returns:
      An array containing type ids of all columns in schema.
    • getFlattenedTypeScales

      public int[] getFlattenedTypeScales()
      Get scales of the columns' types flattened from all levels in schema by depth-first traversal.
      Returns:
      An array containing type scales of all columns in schema.
    • getFlattenedDecimalPrecisions

      public int[] getFlattenedDecimalPrecisions()
      Get decimal precisions of the columns' types flattened from all levels in schema by depth-first traversal.

      This is used to pass decimal precisions from Spark down to the JNI layer only, where some JNI functions need precision values to perform their operations. Decimal precisions should not be consumed by any libcudf API, because libcudf does not support precisions for fixed-point numbers.

      Returns:
      An array containing decimal precisions of all columns in schema.
    • getFlattenedTypes

      public DType[] getFlattenedTypes()
      Get the types of the columns in schema flattened from all levels by depth-first traversal.
      Returns:
      An array containing types of all columns in schema.
    • getChildTypes

      public DType[] getChildTypes()
      Get types of the top level child columns in schema.
      Returns:
      An array containing types of top level child columns.
    • getNumChildren

      public int getNumChildren()
      Get number of top level child columns in schema.
      Returns:
      Number of child columns.
    • getFlattenedNumChildren

      public int[] getFlattenedNumChildren()
      Get the number of child columns of every column in schema, across all levels.
      Returns:
      Numbers of child columns for all levels flattened by depth-first traversal.
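
      A flattened child-count array is enough to recover the tree shape from the other flattened arrays. The sketch below shows one way a consumer might walk it. FlattenedNumChildrenSketch and its methods are hypothetical, and the pre-order layout (each entry immediately followed by its descendants) is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper for walking a pre-order child-count array; not part of cudf.
final class FlattenedNumChildrenSketch {
    // Returns the index just past the subtree that starts at `index`.
    static int skipSubtree(int[] numChildren, int index) {
        int next = index + 1;
        for (int i = 0; i < numChildren[index]; i++) {
            next = skipSubtree(numChildren, next);
        }
        return next;
    }

    // Positions of the top-level columns within the flattened arrays.
    static int[] topLevelIndices(int[] numChildren, int topLevelCount) {
        int[] result = new int[topLevelCount];
        int index = 0;
        for (int i = 0; i < topLevelCount; i++) {
            result[i] = index;
            index = skipSubtree(numChildren, index);
        }
        return result;
    }

    public static void main(String[] args) {
        // Flattened columns: id, point{x, y}, label
        int[] numChildren = {0, 2, 0, 0, 0};
        System.out.println(Arrays.toString(topLevelIndices(numChildren, 3))); // [0, 1, 4]
    }
}
```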
    • getType

      public DType getType()
    • isStructOrHasStructDescendant

      public boolean isStructOrHasStructDescendant()
      Check to see if the schema includes a struct at all.
      Returns:
      true if this schema or any of its descendants is a struct, else false.
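
      Unlike a check on direct children, this test has to look through every level. A minimal recursive sketch, using the hypothetical names StructSearchSketch, Node and Kind rather than any cudf types:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a typed schema node; NOT the ai.rapids.cudf.Schema class.
final class StructSearchSketch {
    enum Kind { INT32, LIST, STRUCT }

    static final class Node {
        final Kind kind;
        final List<Node> children;

        Node(Kind kind, Node... kids) {
            this.kind = kind;
            this.children = Arrays.asList(kids);
        }

        // True if this node is a STRUCT or a STRUCT appears anywhere beneath it.
        boolean isStructOrHasStructDescendant() {
            if (kind == Kind.STRUCT) {
                return true;
            }
            for (Node child : children) {
                if (child.isStructOrHasStructDescendant()) {
                    return true;
                }
            }
            return false;
        }
    }

    // LIST<LIST<STRUCT<INT32>>>: the struct sits two levels down.
    static Node listOfListOfStruct() {
        return new Node(Kind.LIST,
            new Node(Kind.LIST,
                new Node(Kind.STRUCT, new Node(Kind.INT32))));
    }

    // LIST<INT32>: nested, but contains no struct.
    static Node listOfInt() {
        return new Node(Kind.LIST, new Node(Kind.INT32));
    }

    public static void main(String[] args) {
        System.out.println(listOfListOfStruct().isStructOrHasStructDescendant());
        System.out.println(listOfInt().isStructOrHasStructDescendant());
    }
}
```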
    • asHostDataType

      public HostColumnVector.DataType asHostDataType()