Gianmario Spacagna

Published in Towards Data Science

·Pinned

Embedding billions of text documents using Tensorflow Universal Sentence Encoder and Spark EMR

TensorFlow Hub makes available a variety of pre-trained models ready to use for inference. One very powerful model is the (Multilingual) Universal Sentence Encoder, which embeds bodies of text written in any of its supported languages into a common numerical vector representation.

Tensor Flow

8 min read


Published in Towards Data Science

·Pinned

Extracting rich embedding features from COCO pictures using PyTorch and ResNeXt-WSL

How to leverage a powerful pre-trained convolutional neural network to extract embedding vectors for pictures. In this tutorial, I will show you how to leverage a powerful pre-trained convolutional neural network to extract embedding vectors that can accurately describe any kind of picture in an abstract latent feature space. I will show some examples of using ResNeXt-WSL on the COCO dataset using the PyTorch library and…

Computer Vision

8 min read


Published in Towards Data Science

·Jan 9, 2021

A novel approach to Document Embedding using Partition Averaging on Bag of Words

How to take a collection of vector embeddings and average them while preserving the multi-sense topicality of their manifold structures. This is the third article of the “Embed, Cluster, and Average” series. Before diving deep into this tutorial, I recommend first reading the previous two articles: Extracting rich embedding features from pictures using PyTorch and ResNeXt-WSL and Manifold clustering in the embedding space using UMAP and GMM.

Document Embedding

9 min read


Published in Towards Data Science

·Jan 2, 2021

Manifold clustering in the embedding space using UMAP and GMM

How to reduce the dimensionality of embedding vectors while preserving manifold structures grouped into clusters. In the previous article, Extracting rich embedding features from pictures using PyTorch and ResNeXt-WSL, we saw how to represent pictures in a multi-dimensional numerical embedding space. We also saw how effectively the embedding space places similar pictures close to each other. In this tutorial, we will…

Manifold Learning

9 min read


Published in Vademecum of Practical Data Science

·Nov 30, 2020

The Manager’s Non-Technical Guide to Machine Learning

Over the last decade, I have worked with highly talented data science teams from several different industries, including marketing, advertising, automotive, financial services, and cybersecurity. I have contributed to most of the lifecycle phases, worked with executives and stakeholders across many different functions, and seen recent advancements in the machine…

Machine Learning

2 min read


Published in Vademecum of Practical Data Science

·Dec 19, 2019

Knowledge Graphs and Causality

This piece is part of a series on 2019 trends in the AI and Machine Learning industry. You can read my full thoughts on the past year in this summary I wrote for the Helixa blog, which also includes links to the other in-depth pieces in this series. Symbolic AI…

Knowledge Graph

3 min read


Published in Vademecum of Practical Data Science

·Dec 17, 2019

Tech stack and common tools for developing AI

This piece is part of a series on 2019 trends in the AI and Machine Learning industry. You can read my full thoughts on the past year in this summary I wrote for the Helixa blog, which also includes links to the other in-depth pieces in this series. …

Tech Stack

3 min read


Published in Vademecum of Practical Data Science

·Dec 16, 2019

Off-the-shelf Models and AutoML

This piece is part of a series on 2019 trends in the AI and Machine Learning industry. You can read my full thoughts on the past year in this summary I wrote for the Helixa blog, which also includes links to the other in-depth pieces in this series. There is…

AutoML

4 min read


Published in Vademecum of Practical Data Science

·Dec 12, 2019

Federated Learning and Differential Privacy

This piece is part of a series on 2019 trends in the AI and Machine Learning industry. You can read my full thoughts on the past year in this summary I wrote for the Helixa blog, which also includes links to the other in-depth pieces in this series. “Federated Learning”…

Federated Learning

4 min read


Published in Vademecum of Practical Data Science

·Dec 10, 2019

Ethics and Responsible AI

This piece is part of a series on 2019 trends in the AI and Machine Learning industry. You can read my full thoughts on the past year in this summary I wrote for the Helixa blog, which also includes links to the other in-depth pieces in this series. 2019 was…

Ethics

3 min read

Gianmario Spacagna

Director of Artificial Intelligence at Brainly

