Upcasting

Link to official docs

Have you ever run into a scenario where you set a column's type to int, but when you display it in a report or visualization it comes out as a float? This happens because of something called upcasting.

"Types can potentially be upcasted when combined with other types, meaning they are promoted from the current type (e.g. int to float)."

Let's start with a DataFrame of 1 column and 8 rows where the values are random numbers drawn from the standard normal distribution.

>>> import pandas as pd
>>> import numpy as np
>>> df1 = pd.DataFrame(np.random.randn(8, 1), columns=['A'], dtype='float32')
>>> df1
          A
0  0.406792
1  0.810450
2  1.161985
3 -1.402411
4  1.385434
5 -1.091746
6  0.018586
7 -0.606741

Now let's create a 3x8 DataFrame.

>>> df2 = pd.DataFrame( dict( A =...
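To make the int-to-float promotion concrete, here is a minimal sketch of two common ways it shows up. The frames and variable names (`ints`, `floats`, `counts`) are made up for illustration and are not the `df1`/`df2` from the post: first an integer column is added to a float column, then an integer Series is reindexed so that a missing value appears.

```python
import numpy as np
import pandas as pd

# Hypothetical data, not the post's df1/df2.
ints = pd.DataFrame({"A": np.arange(3, dtype="int32")})
floats = pd.DataFrame({"A": np.array([0.5, 1.5, 2.5], dtype="float32")})

# Combining an int column with a float column promotes the result to a float dtype.
combined = ints + floats
print(combined.dtypes)

# Introducing a missing value also forces an upcast: NaN is a float,
# so a plain int64 Series cannot hold it and becomes float64.
counts = pd.Series([1, 2, 3], dtype="int64")
reindexed = counts.reindex([0, 1, 2, 3])
print(reindexed.dtype)

# One way to keep integers alongside missing values is pandas'
# nullable integer dtype, which stores the gap as <NA> instead of NaN.
nullable = counts.astype("Int64").reindex([0, 1, 2, 3])
print(nullable.dtype)
```

If a report needs to show whole numbers, checking `dtypes` right after each combine or reindex step is usually the quickest way to find where the promotion happened.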
Chart Types & Styles

What is your process when you set out to make a data visualization? Do you sketch what you are trying to show or discover? If you use Tableau, do you just start throwing variables into the frame? Below is a useful thought starter as you plan your data viz.

Chart suggestions: a thought starter

Visualizing the most common business data requires only a two-dimensional representation. Adding a third variable can be confusing for people who don't work with data every day. So let's keep it simple and go through some of the decision-making steps for choosing a visualization.

Three important things to consider are:

Number of variables
    If you have more than four you may be in for a cluttered, unclear chart.
Type of variables
    Numeric - discrete or continuous
    Categorical - related or unrelated categories
Association you are trying to show
    Relationship - multiple variables, dependent or independent
    Comparison - difference or similarity
    Trends over...
Window functions

This is an example taken straight from the PostgreSQL documentation.

So what is a window function and why would you use it? Window functions allow you to perform aggregations over groups of rows while keeping the individual rows separate. Let's say you want to know the average salary by department and use it to find each employee's difference from that average. The average is not stored in your employee salary table (empsalary), so you need to calculate it. You can use a window function to create a "window" around each department; the aggregate function is then applied only to the rows in that window.

SELECT depname, empno, salary,
       avg(salary) OVER (PARTITION BY depname)
FROM empsalary;

Returns

  depname  | empno | salary |          avg
-----------+-------+--------+-----------------------
 develop   |    11 |   5200 | 5020.0000000000000000
 develop   |     7 |   4200 | 5020.0000000000000000
 develop   |     9 |   4500 | 5020.000000...
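The query above stops at the department average, so here is a rough sketch of the next step the post describes: each employee's difference from that average. It is written in Python with the standard-library `sqlite3` module rather than PostgreSQL, on the assumption that the bundled SQLite is version 3.25 or later (the first release with window functions). The sample rows are illustrative data modeled on the `empsalary` table, and the `dept_avg` and `diff_from_avg` column aliases are names chosen here, not part of the original query.

```python
import sqlite3

# Illustrative rows modeled on the empsalary table (depname, empno, salary).
rows = [
    ("develop", 11, 5200),
    ("develop", 7, 4200),
    ("develop", 9, 4500),
    ("develop", 8, 6000),
    ("develop", 10, 5200),
    ("personnel", 5, 3500),
    ("personnel", 2, 3900),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE empsalary (depname TEXT, empno INTEGER, salary INTEGER)"
)
conn.executemany("INSERT INTO empsalary VALUES (?, ?, ?)", rows)

# The window (PARTITION BY depname) limits avg() to each department,
# while every employee row is still returned individually.
query = """
    SELECT depname, empno, salary,
           avg(salary) OVER (PARTITION BY depname) AS dept_avg,
           salary - avg(salary) OVER (PARTITION BY depname) AS diff_from_avg
    FROM empsalary
    ORDER BY depname, empno
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)

conn.close()
```

A plain GROUP BY would collapse each department to a single row, so the per-employee difference would need a join back to the table; the window function avoids that extra step.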