By Tirthajyoti Sarkar, 05/12/2017, on Medium
Python is fast emerging as the de facto programming language of choice for data scientists. But unlike R or Julia, it is a general-purpose language and does not have a functional syntax for analyzing and transforming numerical data right out of the box. So, it needs a specialized library.
NumPy, short for Numerical Python, is the fundamental package required for high-performance scientific computing and data analysis in the Python ecosystem. It is the foundation on which nearly all of the higher-level tools such as Pandas and scikit-learn are built. TensorFlow uses NumPy arrays as the fundamental building block on top of which its Tensor objects and graph flow for deep learning tasks are built (these tasks make heavy use of linear algebra operations on long lists, vectors, and matrices of numbers).
Many NumPy operations are implemented in C, avoiding the general cost of loops in Python, pointer indirection, and per-element dynamic type checking. The speed boost depends on which operations you're performing. For data science and modern machine learning tasks, this is an invaluable advantage.
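As a minimal sketch of what "implemented in C" means in practice, the snippet below computes the same quantity twice: once element by element in Python, and once as a single whole-array NumPy expression. The sample data and the variable names are illustrative assumptions, not taken from the article.

```python
import math

import numpy as np

# Hypothetical sample data: 1000 evenly spaced values (not from the article).
values = np.linspace(0.0, 1.0, 1000)

# Pure-Python route: one interpreted iteration per element.
loop_result = [math.sin(v) * 2.0 for v in values]

# NumPy route: the same arithmetic runs in compiled loops over the whole array.
vector_result = np.sin(values) * 2.0

# Both routes agree up to floating-point rounding.
print(np.allclose(loop_result, vector_result))
```

The two results match; only the place where the loop runs differs, which is where the speed difference comes from.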
My recent story demonstrating the advantage of NumPy-based vectorization for a simple data transformation task was well received by readers. It also prompted some interesting discussion on the utility of vectorization versus code simplicity.
Mathematical transformations based on some predefined condition are fairly common in data science tasks. It turns out that one can easily vectorize simple blocks of conditional loops by first turning them into functions and then using the numpy.vectorize method. In my previous article I showed an order-of-magnitude speed boost from NumPy vectorization of a simple mathematical transformation. For the present case the speedup is less dramatic, because numpy.vectorize still calls the Python function once per element, so the internal conditional looping remains somewhat inefficient. However, in my tests there is still a 20–50% improvement in execution time over the other plain vanilla Python approaches.
Here is the simple code to demonstrate it:
import numpy as np
from math import sin as sn
import matplotlib.pyplot as plt
import time

# Number of test points
N_point = 1000

# Define a custom function with some if-else branches
def myfunc(x, y):
    if (x > 0.5*y and y < 0.3):
        return sn(x - y)
    elif (x < 0.5*y):
        return 0
    elif (x > 0.2*y):
        return 2*sn(x + 2*y)
    else:
        return sn(y + x)

# Arrays of test values, generated from a Normal distribution
lst_x = np.random.randn(N_point)
lst_y = np.random.randn(N_point)
lst_result = []

# Optional plots of the data
plt.hist(lst_x, bins=20)
plt.show()
plt.hist(lst_y, bins=20)
plt.show()

# First, plain vanilla for-loop
t1 = time.time()
for i in range(len(lst_x)):
    x = lst_x[i]
    y = lst_y[i]
    if (x > 0.5*y and y < 0.3):
        lst_result.append(sn(x - y))
    elif (x < 0.5*y):
        lst_result.append(0)
    elif (x > 0.2*y):
        lst_result.append(2*sn(x + 2*y))
    else:
        lst_result.append(sn(y + x))
t2 = time.time()

print("\nTime taken by the plain vanilla for-loop\n----------------------------------------------\n{} us".format(1000000*(t2-t1)))

# List comprehension (%timeit is a Jupyter/IPython magic command)
print("\nTime taken by list comprehension and zip\n"+'-'*40)
%timeit lst_result = [myfunc(x,y) for x,y in zip(lst_x,lst_y)]

# Map() function
print("\nTime taken by map function\n"+'-'*40)
%timeit list(map(myfunc,lst_x,lst_y))

# Numpy.vectorize method
# (otypes uses the builtin float; the np.float alias was removed in NumPy 1.24)
print("\nTime taken by numpy.vectorize method\n"+'-'*40)
vectfunc = np.vectorize(myfunc, otypes=[float], cache=False)
%timeit list(vectfunc(lst_x,lst_y))
# Results

Time taken by the plain vanilla for-loop
----------------------------------------------
2000.0934600830078 us

Time taken by list comprehension and zip
----------------------------------------
1000 loops, best of 3: 810 µs per loop

Time taken by map function
----------------------------------------
1000 loops, best of 3: 726 µs per loop

Time taken by numpy.vectorize method
----------------------------------------
1000 loops, best of 3: 516 µs per loop
Notice that I have used the %timeit Jupyter magic command everywhere I could write the evaluated expression in one line. That way I am effectively running at least 1000 loops of the same expression, which reduces the influence of random effects on the reported time. The plain vanilla for-loop, by contrast, is timed only once, so if you run this whole script in a Jupyter notebook you may get a slightly different result for that first case, but the next three should show a very consistent trend (with absolute numbers depending on your computer hardware).
We see evidence that, for this data transformation task based on a series of conditional checks, the vectorization approach using NumPy routinely gives a 20–50% speedup compared to general Python methods such as list comprehension and map.
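Outside a notebook, the standard-library timeit module can give the for-loop the same repeated-measurement treatment. The sketch below is one possible way to do it: plain_loop is a hypothetical helper that calls myfunc instead of inlining the branches as the article's loop does, and the run counts are arbitrary choices made to keep the example quick.

```python
import timeit
from math import sin

import numpy as np

def myfunc(x, y):
    # Same branching as the article's myfunc.
    if x > 0.5 * y and y < 0.3:
        return sin(x - y)
    elif x < 0.5 * y:
        return 0
    elif x > 0.2 * y:
        return 2 * sin(x + 2 * y)
    else:
        return sin(y + x)

lst_x = np.random.randn(1000)
lst_y = np.random.randn(1000)

def plain_loop():
    # Hypothetical helper: the for-loop variant, wrapped so it can be timed repeatedly.
    result = []
    for i in range(len(lst_x)):
        result.append(myfunc(lst_x[i], lst_y[i]))
    return result

# 3 runs of 100 calls each (assumed counts); each entry is the total seconds for one run.
runs = timeit.repeat(plain_loop, number=100, repeat=3)

# Convert the fastest run to microseconds per call.
best_us = min(runs) / 100 * 1e6
print("plain for-loop: {:.1f} us per loop (best of 3)".format(best_us))
```

Taking the minimum over several runs mirrors the "best of 3" figure that %timeit printed in the results above.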
It may not seem a dramatic improvement, but every bit of time saved adds up in a data science pipeline and pays back in the long run! If a data science job requires this transformation to happen a million times, the roughly 1.5 ms per-call gap between the for-loop and numpy.vectorize in the results above grows to about 25 minutes in total, and heavier per-call workloads scale the saving accordingly.
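To make the scaling concrete, the per-loop timings from the results above can be multiplied out. This is only arithmetic on the reported numbers; the million-repetition scenario is the one mentioned in the text, and actual timings will vary by machine.

```python
# Per-loop timings reported in the results above, in microseconds.
timings_us = {
    "for-loop": 2000.0934600830078,
    "list comprehension": 810.0,
    "map": 726.0,
    "numpy.vectorize": 516.0,
}

repetitions = 1_000_000  # the "million times" scenario
baseline = timings_us["numpy.vectorize"]

for name, t in timings_us.items():
    total_minutes = t * repetitions / 1e6 / 60   # us -> s -> min
    saving = 100 * (t - baseline) / t            # % of time saved by numpy.vectorize
    print("{:<20s} {:6.1f} min total, numpy.vectorize saves {:4.1f}%".format(
        name, total_minutes, saving))
```

On these figures, numpy.vectorize takes about 36% less time than the list comprehension and about 29% less than map, which is where the 20–50% range comes from.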
In short, wherever you have a long list of data and need to perform some mathematical transformation over it, strongly consider turning those Python data structures (lists, tuples, or dictionaries) into numpy.ndarray objects and using their inherent vectorization capabilities.
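One way to use those whole-array capabilities for the conditional transformation in this article is numpy.select, which picks, for each element, the choice belonging to the first condition that holds. This is a swapped-in technique, not the method the article benchmarks; myfunc_select is a hypothetical name, and no timing claim is made for it here.

```python
from math import sin

import numpy as np

def myfunc(x, y):
    # Scalar reference: the article's branching logic.
    if x > 0.5 * y and y < 0.3:
        return sin(x - y)
    elif x < 0.5 * y:
        return 0
    elif x > 0.2 * y:
        return 2 * sin(x + 2 * y)
    else:
        return sin(y + x)

def myfunc_select(x, y):
    # Hypothetical whole-array version. np.select takes the first condition
    # that is True for each element, mirroring the if/elif order above.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    conditions = [
        (x > 0.5 * y) & (y < 0.3),
        x < 0.5 * y,
        x > 0.2 * y,
        np.ones_like(x, dtype=bool),   # catch-all, plays the role of `else`
    ]
    choices = [
        np.sin(x - y),
        np.zeros_like(x),
        2 * np.sin(x + 2 * y),
        np.sin(y + x),
    ]
    return np.select(conditions, choices)

# Plain Python lists go in; np.asarray turns them into ndarrays.
# The four sample pairs are chosen to exercise each branch once.
xs = [0.9, -1.2, 1.0, -1.0]
ys = [0.1, 0.5, 0.5, -2.0]
reference = np.vectorize(myfunc, otypes=[float])(xs, ys)
print(np.allclose(myfunc_select(xs, ys), reference))
```

Note that this version evaluates every branch expression for every element and then selects among them, so whether it beats numpy.vectorize on a given workload is something to measure rather than assume.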
NumPy provides a C-API for even faster code execution, but it takes away the simplicity of Python programming. This SciPy lecture note shows all the related options you have in this regard.
There is an entire open-source, online book on this topic by a French neuroscience researcher. Check it out here.
If you have any questions or ideas to share, please contact the author at tirthajyoti[AT]gmail.com. You can also check the author's GitHub repositories for other fun code snippets in Python, R, or MATLAB and for machine learning resources. If you are, like me, passionate about machine learning/data science/semiconductors, please feel free to add me on LinkedIn.
Thank you!