Posts

Objects and memory in Python

I read this Medium article on Python memory management. It led me down a little rabbit hole of Python data types and memory. Thinking back to CS 101, you learn about two kinds of objects: mutable and immutable. In Python they are:
Immutable objects: int, float, long (Python 2 only), complex, string, tuple, bool
Mutable objects: list, dict, set, byte array, user-defined classes
I'll be referencing CPython, not other implementations like PyPy. When investigating how things work under the hood, id() is invaluable. In CPython, this function returns the memory address of the object. No two objects that exist at the same time have the same identity. But as you'll see, two variables can reference the same object. Alongside id(), the is operator tells you whether two variables reference the same object. This is different from the equality operator ==, which compares values. Python variables work a lot like pointers in other languages. They refer to a value stored somewhere in memory. If it is a mutable object and the value changes the memory ...
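Here is a minimal sketch of the id(), is, and == behavior described above, assuming CPython. The variable names are my own, made up for illustration.

```python
# Two names bound to one list, plus a separate list with the same contents.
a = [1, 2, 3]
b = a            # b references the same list object as a
c = [1, 2, 3]    # equal value, but a different object

same_object = a is b                         # identity: same object
equal_but_distinct = a == c and a is not c   # equality without identity

# Mutating a list in place keeps its identity.
list_id_before = id(a)
a.append(4)
list_id_unchanged = id(a) == list_id_before
alias_sees_change = b == [1, 2, 3, 4]        # b is the same object, so it changed too

# Integers are immutable: += rebinds the name to a new object.
n = 1000
m = n            # m keeps the original int object alive
int_id_before = id(n)
n += 1
int_rebound = id(n) != int_id_before
old_value_intact = m == 1000

print(same_object, equal_but_distinct, list_id_unchanged,
      alias_sees_change, int_rebound, old_value_intact)
```

I used a list and a large int on purpose. CPython caches some small integers and strings, so identity checks on those can give surprising results.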

Random Thoughts For the Day

“The plural of anecdote is not data.”
Data Science and the Art of Persuasion: an article by Scott Berinato, author of Good Charts, from the January – February 2019 issue.
Did you know Florence Nightingale created Coxcomb Charts?
A story you probably heard in your stats 101 class: Student's t-test and Guinness.

UNIX Quick Hit

I was doing some work on the command line and thought it would be helpful to list out some useful Unix tidbits.
List files in human-readable format
$ ls -lhS
-l     use a long listing format
-h     with -l, print sizes in human readable format (e.g., 1K 234M 2G)
-S     sort by file size
You can use man ls to see all possible options.
FIND
Recursively find all files with the extension .pyc
$ find <directory> -type f -name "*.pyc"
For example, if you were looking for all .pyc files in your current directory and all subdirectories, you would type
$ find . -type f -name "*.pyc"
How to Create an SSH Shortcut
If you find yourself SSHing to the same host repeatedly, I recommend creating a shortcut for this command in the config file in your .ssh directory.
$ cd ~/.ssh
$ vim config
From here, you can create shortcuts. You can specify the hostname, username, port, and the private key. For a ful...
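If you ever need the find command from the FIND section inside a script, here is a rough Python equivalent using pathlib. The function name find_by_extension is my own invention for this sketch, not a standard tool.

```python
from pathlib import Path


def find_by_extension(directory, extension=".pyc"):
    """Return regular files under `directory` (recursively) whose name ends in `extension`.

    Roughly mirrors: find <directory> -type f -name "*.pyc"
    """
    root = Path(directory)
    # rglob walks the directory and all subdirectories;
    # is_file() plays the role of -type f by skipping directories.
    return sorted(p for p in root.rglob(f"*{extension}") if p.is_file())
```

Calling find_by_extension(".") lists the .pyc files in the current directory and everything below it, sorted by path.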

WINDOW functions and LAG()

Window functions
This is an example taken straight from the PostgreSQL documentation.
So what is a window function and why would you use it? Well, window functions allow you to perform aggregations over groups while keeping the rows separate. So let's say you want to know the average salary by department and use that to find each employee's difference from the average. The average is not stored in your employee salary table (empsalary), so you need to calculate it. You can use a window function to create a "window" around each department; the aggregate function is then applied only to that window.
SELECT depname, empno, salary,
       avg(salary) OVER (PARTITION BY depname)
FROM empsalary;
Returns
 depname | empno | salary |          avg
---------+-------+--------+-----------------------
 develop |    11 |   5200 | 5020.0000000000000000
 develop |     7 |   4200 | 5020.0000000000000000
 develop |     9 |   4500 | 5020.000000...
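To see what PARTITION BY is doing, here is a small plain-Python sketch that emulates avg(salary) OVER (PARTITION BY depname) and adds each employee's difference from the department average. The rows below are made-up sample data, not the PostgreSQL documentation's table, and with_department_average is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical sample rows: (depname, empno, salary)
rows = [
    ("develop", 1, 5000),
    ("develop", 2, 4000),
    ("sales", 3, 4800),
    ("sales", 4, 5200),
]


def with_department_average(rows):
    """Append the department average and the difference from it to every row."""
    # Group salaries by department: this is the "window" for each row.
    salaries = defaultdict(list)
    for depname, _empno, salary in rows:
        salaries[depname].append(salary)

    averages = {dep: sum(vals) / len(vals) for dep, vals in salaries.items()}

    # Every input row is kept; nothing collapses the way it would with GROUP BY.
    return [
        (depname, empno, salary, averages[depname], salary - averages[depname])
        for depname, empno, salary in rows
    ]


for row in with_department_average(rows):
    print(row)
```

The point to notice is that the output has as many rows as the input. The aggregate is computed per department but attached to each individual row.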

New books and podcasts!

New Books
Just got Storytelling with Data: A Data Visualization Guide for Business Professionals by Cole Nussbaumer Knaflic in the mail.
The first line of the introduction is a quote from Yale Professor Emeritus Edward Tufte: "Power corrupts. PowerPoint corrupts absolutely." (Wired magazine, 2003)
This book is focused on everything that goes into conveying information inside your organization. It can be difficult working across teams. I have found you really need to know your audience to create the most effective visualization. Many times I have created what I thought was a great visualization with an obvious trend or insight, only to get the response, "What is this telling me?" or "I just want to know xyz." Similarly, there are times when I present a simple graph that spurs a series of questions that gets to what the stakeholder's REAL question was all along. Also that quote from Tufte makes me want...

Chart Types & Styles

What is your process when you set out to make a data visualization? Do you sketch what you are trying to show or discover? If using Tableau, do you just start throwing variables into the frame? Below is a useful thought starter as you plan your data viz.
Chart suggestions: a thought starter
Visualizing the most common business data requires only a two-dimensional representation. Adding a third variable can be confusing for people who don't work with data every day. So let's keep it simple and go through some of the decision-making steps when deciding which visualization to use.
Three important things to consider are:
Number of variables
    If you have more than four, you may be in for a cluttered, unclear chart.
Type of variables
    Numeric: discrete or continuous
    Categorical: related or unrelated categories
Association you are trying to show
    Relationship: multiple variables, dependent or independent
    Comparison: difference or similarity
    Trends over...