You use the --split-by clause, but the import still does not perform optimally. How can you improve performance further?

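In such situations, the --boundary-query argument can be used. By default, Sqoop runs a query of the form SELECT MIN(<split-by column>), MAX(<split-by column>) FROM <table> to determine the boundary values for creating splits. If this default query is not optimal, for example because it is slow on a large table, --boundary-query lets you supply any arbitrary query instead, as long as it returns two numeric values that serve as the lower and upper bounds of the split range.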
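
As a rough illustration, an import that overrides the default boundary query might look like the sketch below. The host, database, table, column, credentials file, and target directory are all hypothetical, and the bounds 1 and 5000000 are assumed to be known in advance. A SELECT without a FROM clause works on MySQL; other databases may need a different form. The script only prints the command unless RUN_SQOOP=1 is set.

```shell
#!/bin/sh
# Sketch only: every host, table, column, and path here is a hypothetical placeholder.

# Cheap replacement for the default MIN/MAX query. It must return one row
# with two numeric values: the lower and upper bounds of the split range.
# The bounds are assumed to be known in advance.
BOUNDARY_QUERY='SELECT 1, 5000000'

# Store the arguments as positional parameters so quoting is preserved.
set -- import \
  --connect 'jdbc:mysql://db.example.com/sales' \
  --username 'etl_user' \
  --password-file '/user/etl/.db_password' \
  --table 'orders' \
  --split-by 'order_id' \
  --boundary-query "$BOUNDARY_QUERY" \
  --num-mappers 4 \
  --target-dir '/data/raw/orders'

# Print the command for review; set RUN_SQOOP=1 to actually run it.
if [ "${RUN_SQOOP:-0}" = "1" ]; then
  sqoop "$@"
else
  echo "sqoop $*"
fi
```

Rows whose split-by value falls outside the supplied bounds are not covered by any split, so hard-coded bounds should only be used when they are known to be current.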