Most recent releases are shown at the top. Each release shows:
- New: New classes, methods, functions, etc
- Changed: Additional parameters, changes to inputs or outputs, etc
- Fixed: Bug fixes that don't change documented behaviour
Note that the top-most release lists the changes in the unreleased master branch on GitHub. Parentheses after an item show the name or GitHub ID of the contributor of that change.
- `ImageDownloader()` widget for quick image datasets research
- `Learner.export` to export the state of a `Learner` for inference (with `Callback.get_state` to get the state of a callback behind the scenes)
- `load_learner` to load a `Learner` from an exported state (with `load_callback` to load the state of a callback behind the scenes)
- A dataset can also be a `Callback` if we want to apply changes at the beginning of every epoch
- If no label is provided, the test set has `EmptyLabel` for every item
- `LanguageModelLoader` becomes `LanguageModelPreLoader` and is a dataset to wrap in a PyTorch `DataLoader`
- Avoid bugs in tabular by copying the dataframe in `TabularList.from_df`
- Can properly change the batch size even if the `DataLoader` is a `LanguageDataLoader`
- Bug in `ImageBBox` when all the targets had the same number of bboxes
- Default metric in `RNNLearner` is accuracy only for language models or classification tasks
- Throws a clear error message when trying to use `databunch` on not-split data
- `Fbeta_binary` is now `FBeta`
- `Learner.to_fp32()` to go back to FP32 precision mode
- `cont_cat_split` function to automatically get categorical/continuous variables (thanks to RealLankinen)
- Lots of new metrics thanks to Sven Becker: `mse/mean_squared_error`, `mae/mean_absolute_error`, `rmse/root_mean_squared_error`, `msle/mean_squared_logarithmic_error`, `explained_variance`, `r2_score`, `top_k_accuracy`, `KappaScore`, `MatthewsCorreff`, `Precision`, `Recall`, `FBeta`
- `BatchNorm1dFlat` for using batchnorm in sequence models (e.g. RNNs, and their inputs and outputs)
- The data block API has additional checks with assertions (NaNs in columns used for inputs/labels in dataframes, empty items)
- kwargs are checked in the data block API
- `model_summary` now returns summary instead of printing it
- Predictions now work in FP16 mode
- Model is unwrapped at the end of a distributed training (thanks to mgrankin)
- `DataBunch.export` works for multi-classification problems where `one_hot=True`
- Fix bug in `DatasetFormatter`
- Fix `LanguageLearner.predict`
- If you want to import basic fastai functionality without an application, you should now use `from fastai.basics import *` instead of `from fastai import *`. (However, note that you now don't need either when using an application, as mentioned in Changed below.)
- In fastai.text batch is now the first dimension
- `fastai.script` module contains a simple decorator for quickly creating CLIs
- `setup_distrib` does all setup required for distributed training for you
- Sample training scripts for MNIST sample (single GPU) and CIFAR10 (multi-GPU fp16) in `examples`
- `fastai.launch` module for simplified single-machine multi-GPU training
- `check_perf` - performance improvement recommendations
- `distributed` module with helper functions to quickly launch a distributed training
- Tentative use of JIT C++ extensions to code the QRNN with `batch_first` argument; it needs a proper installation of CUDA to be compiled at execution time
- When importing an application such as `from fastai.vision import *` you no longer need to also `from fastai import *`
- `SequentialEx`, `MergeLayer`, and `res_block` to more easily create resnet and densenet architectures
- `no_split` method in the data block API
- `sigmoid_range` function to scale sigmoid to given range, along with `SigmoidRange` layer
- `DataBunch` performs a sanity check after its initialization and will throw a warning if something is wrong with the data.
- More GAN stuff: `gan_critic`, `AdaptiveLoss`, `accuracy_thresh_expand`, and `GANDiscriminativeLR`
- Support for one-hot encoded labels in multiclassification problems
- Add `Dataset.Fix` (same as train but with `shuffle=False`, `drop_last=False` and valid transforms)
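
As a rough illustration of the `sigmoid_range` entry above, scaling a sigmoid to a given range can be sketched in plain Python. This is a minimal sketch, not the library's implementation: the name `sigmoid_range_sketch` and the parameters `low` and `high` are assumptions made for illustration, and the real function works on tensors rather than floats.

```python
import math


def sigmoid_range_sketch(x: float, low: float, high: float) -> float:
    """Squash x into the open interval (low, high) with a scaled sigmoid."""
    sigmoid = 1.0 / (1.0 + math.exp(-x))
    # Stretch the (0, 1) output to width (high - low), then shift it to start at low.
    return sigmoid * (high - low) + low


# A zero input lands at the midpoint of the requested range.
midpoint = sigmoid_range_sketch(0.0, 0.0, 5.0)
```

A typical use is bounding a regression output, for example predicted ratings that should stay between a known minimum and maximum.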
- Experimental cross-connection from raw input plus extra resblock at end of unet
- Add an execution-time check for a specific version of fastprogress (`git pull` fastai updates)
- `DataBunch.export` now serializes everything (transforms and normalization included)
- `DataBunch` now has `fix_dl` attr, which is same data as `train_dl` but without shuffle or train tfms
- `pred_batch` now has `reconstruct` param, which will reconstruct each prediction into an object
- `Learner.show_results` gives a better output for image classification tasks
- Windows fixes, including:
  - Most transforms can now be used in Windows with `num_workers`>0
  - Avoid recursion error with data blocks API
  - Try to avoid default `np.int32` creation where possible
- `y_range` for unet output activation
- `Image.apply_tfms` doesn't accept any kwargs anymore
- `split_from_files` works with `from_df`/`from_csv`
- `LabelLists.load_empty` (most useful for adding test sets for inference)
- Update deps to release version of pytorch v1
- pypi wheel `dataclasses` dependency for py3.6 is there again
- `Learner.interpret` is a shortcut to `ClassificationLearner.from_learner`.
- Language models now use flattened loss, instead of flattening y in data loader
- `ItemList.from_folder` now has an `include` parameter to only include certain folders
- `Learner.load` won't throw an error when trying to load an optimizer state of the wrong size, and silently ignores that optimizer state loading
- `TabularDataBunch.from_df` accepts a `test_df` argument
- `LanguageLearner.predict` now returns better text predictions
- Unfreezing layers didn't create a new optimizer so the unfrozen layers weren't training
- Bug in `TextDataBunch` with a mismatched test set was causing problems on the validation set
- `ImageCleaner` with duplicates=True to use as a duplicate detector
- `DatasetFormatter.from_similars()` to feed the most similar indexes into `ImageCleaner`
- `chunks` to separate a Collection into smaller iterables
- `batchnorm_2d` wrapper for batchnorm with init
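
To make the `chunks` entry above concrete, splitting a collection into smaller pieces can be sketched as a generator. This is an illustrative sketch under assumptions, not the library's code: the name `chunks_sketch` and the chunk-size parameter `n` are hypothetical.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunks_sketch(items: Sequence[T], n: int) -> Iterator[Sequence[T]]:
    """Yield successive slices of at most n elements from items."""
    for start in range(0, len(items), n):
        yield items[start:start + n]


# The last chunk is shorter when len(items) is not a multiple of n.
batches = list(chunks_sketch([1, 2, 3, 4, 5], 2))
```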
- `Learner.load` and `Learner.save` will also load/save the optimizer state
- `ImageItemList` now takes optional `convert_mode`
- `Image.show` now uses `defaults.cmap` if no `cmap` passed
- `bn` param in `conv_layer` replaced by `norm_type` which takes `NormType` enum
- unet kwargs are passed down to `conv_layer`
- `Learner.fit` no longer creates a new optimizer at each call
- Add batchnorm to end of unet
- Restore `ImageDataBunch.single_from_classes`
- `ItemList.set_item` is now a context manager, so you don't need to call `clear_item`
- Removed `ItemList.clear_item`
- Init `torch.set_num_threads(4)` to avoid OpenMP process creation overhead
- `Tokenizer` wasn't using >1 thread
- `Learner.summary`
- `add_datepart`
- `DeviceDataLoader.new` method to get a copy of a `DeviceDataLoader` while changing an attribute
- `DataBunch.batch_size` allows changing the batch size of all the dataloaders
- `ImageDataBunch.single_from_classes` has been removed
- `Learner.create_unet` is now called `unet_learner`
- Every type of items now has a `reconstruct` method that does the opposite of `ItemBase.data`: taking the tensor data and creating the object back
- `Learner.show_results` now works across applications
- `DataBunch.export`: saves the internal information (classes, vocab in text, processors in tabular etc) needed for inference in a file named 'export.pkl'. You can then create an `empty_data` object by using `DataBunch.load_empty(path)` (where `path` points to where this 'export.pkl' file is). This also works across applications
- GAN and CycleGAN
- `parallel`: Run a function on every element of an array, using multiple processes
- `icnr` initializes a weight matrix with ICNR
- `PixelShuffle_ICNR` layer that combines PixelShuffle, a suitable conv2d, plus optional weightnorm and `(scale,scale)` blurring
- `Learner.clip_grad` convenience function for `GradientClipping` callback
- `plot_flat`, `plot_multi`, `show_multi`, `show_all`: simple functions for showing images on subplots
- `ItemList.to_text` to save items to a text file
- `ItemList.filter_by_rand` to randomly sample items
- `LabelList.transform_y` to use different transformation params for `y` (thanks to Fred Monroe)
- `LabelList.{to_df,to_csv}` to save items including labels
- `DataBunch` convenience properties: `test_ds` and `single_ds`
- `DataBunch.single_item` to convert an `ItemBase` into a batch (tensor + dummy y)
- `Learner.pred_batch()` can now take an optional batch to predict, rather than grabbing its own
- introduce `EmptyLabel` and `EmptyLabelList`
- `lr_range` now divides non-final layer LRs by 10, instead of 3, when called with `slice(lr)`
- `Learner.load` now has a `strict` argument like PyTorch's `load_state_dict`
- 1cycle training now uses cosine reverse annealing instead of linear
- `conv2d` and `conv_linear` now initialize weights/bias by default
- `core.to_detach` now moves data to CPU
- `vision.models.unet` now uses `PixelShuffle_ICNR` for upsampling, with optional weightnorm and blurring
- `vision.models.unet` final layer now has twice as many activations
- `one_batch` moved to `DataBunch`, and can `detach` and `denorm` if requested
- `Hooks` and `Hook` can now be used as context managers
- Moved some non-image-specific functions from `vision.image` to `torch_core`
- Change `grid_sample` to downsample smoothly
- Reduce the number of hooked modules to just those required in `vision.models.unet`
- `hook_output(s)` can also hook the backward/grad now
- `bn_final` param in `TabularModel` and `create_cnn` to add batchnorm after final affine layer
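
The `lr_range` change above (non-final layer learning rates divided by 10 when called with `slice(lr)`) can be pictured with a small sketch. It assumes one learning rate per layer group and only covers the `slice(lr)` case; the function name `lr_range_sketch` and the parameter `n_groups` are hypothetical and not part of the library.

```python
from typing import List


def lr_range_sketch(lr: slice, n_groups: int) -> List[float]:
    """Spread slice(lr) over n_groups layer groups: earlier groups get lr / 10."""
    if lr.start is not None:
        # slice(start, stop) is a different case and is out of scope for this sketch.
        raise ValueError("this sketch only covers slice(lr), with no start value")
    return [lr.stop / 10] * (n_groups - 1) + [lr.stop]


# Three layer groups: the first two train ten times slower than the last one.
rates = lr_range_sketch(slice(1e-2), 3)
```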
- factory methods of `TextDataBunch` accept `max_vocab` (thanks to jfilter)
- `vision.models.unet` now uses `eval` correctly when building model
- classes are sorted when created to avoid having them change when restarting the notebook
- fix loading issues with the test set in `TextDataBunch`
- fix random bug in `TextDataBunch.from_ids` (thanks to PiotrCzapla)
- `get_files` and `get_image_files` now return `Path`s relative to `path`, instead of relative to `.`
- `ItemList.items` are also relative to `path` where relevant, since `get_files` is called internally
- `create_func` is removed in the data API; subclass and change the `get` method instead (in vision, you can subclass the `open` method if you want to change how the images are opened)
- `Vocab` and `TabularTransform` can now be saved
- Each application has its method to create an inference learner
- `model_summary` function for standard models (thanks to @noklam)
- Added `pca` to `torch.Tensor`
- Add methods to get embeddings from `CollabLearner`
- `verify_image` - now fixes files with corrupt EXIF data
- We can add transform to `y` in the data block API
- metric fbeta for single classification (thanks to wy-q)
- ItemLists can now set `self.filter_missing_y` to automatically remove items from LabelLists training set that can't be labeled
- revert xxmaj token and `deal_caps` rule
- xxmaj token and new `deal_caps` rule
- `Tokenizer` has `pre_rules` and `post_rules` now (for before and after tokenization)
- `mark_fields` now defaults to `False`
- `FloatList` to do regression
- Use of real neural nets in `collab`
- Remove `TextFilesList` as you can now use `TextList` instead
- Consistent use of `cols` / `col` in the data block API depending on if you can pass multiple columns or not
- Collab is refactored with the data block API behind the scene
- `get_collab_learner` and `get_tabular_learner` become `collab_learner` and `tabular_learner` for name harmonization across applications
- `get_embedding` becomes `embedding`
- `ImageDeleter` and `ImageRelabeler` are merged into `ImageCleaner`
- `show_batch` works with `rows=1`
- Pretrained language models are saved in the correct folder (.fastai/models/)
- Splitting too slow in the data block API
- Mixup losses work with predict and TTA (thanks to bharadwaj6)
- Wrong size for the added test set in the data block API (thanks to wdhorton)
- Fix to the QRNN (thanks to PiotrCzapla)
- No changes
- `Learner.predict` works across applications
- `Learner.show_batch` works across applications
- `tools/build-docs` and `tools/update-nbs` scripts combined into one script
- Big refactor of the data block API
- `download_images` works with different kinds of suffixes (thanks to fpingham)
- We no longer import submodule names automatically with `import *`
- Callbacks are now inside the `callbacks` namespace if you `from fastai import *`
- All the `DataBunch` factory methods use the data block API; the factory methods of `Datasets` are deprecated and will be removed in a future version
- `learn.predict` fixed
- wrong dimension in dice (thanks to noklam)
- `CSVLogger` callback (thanks to devorfu)
- Initial support for image regression problems
- If a dataset class has `learner_type` then `create_cnn` uses that type to create the `Learner`
- Introduce TaskType in `DatasetBase` to deal with single/multi-class or regression problems across applications
- `datasets()` now can automatically figure out what class to use in many situations
- `download_images()` now saves images with their original extensions
- `DataBunch.dl` replaces the various `holdout`, `is_test`, and `is_train` approaches with a single consistent enum
- `fastai.text` is fully compatible with the data block API
- `download_url` reads the get request with `iter_content` which is robust to 'content-length' errors. (thanks to Francisco Ingham and Zach Caceres)
- `download_url` has a timeout
- `create_cnn` correctly calculates # features in body for more architectures
- `TextDataset` now has two subclasses for the preprocessing steps and doesn't do that preprocessing automatically
- `TextDataBunch` doesn't save the result of preprocessing automatically, you have to use `TextDataBunch.save`
- `RNNLearner.classifier` is now `text_classifier_learner` and `RNN_Learner.language_model` is now `language_model_learner`
- `pil2tensor` is faster and works on more image types (thanks to kasparlund)
- Imports in the file picker widget (thanks to Hiromi)
- Batches of size 1 will be removed during training because of the issue with BatchNorm1d
- Confusion matrix shows ints if `normalize=False` (default)
- `RNNLearner.get_preds` returns the preds in the right order (thanks to StatisticDean)
- `num_features_model` now works with any model
- `resize_method` wasn't properly set when passed to `ImageDataBunch`
- `reset` the RNNs at the beginning of each epoch in `RNNTrainer`
- add an argument `resize_method` that tells `apply_tfms` how to resize the image to the desired size (crop, pad, squish or no)
- all the image datasets have an `image_opener` attribute (default `open_image`) that can be changed. The `SegmentationDataset` has a `mask_opener` attribute
- `add_test` and `add_test_folder` in data block API
- jupyter et al no longer forced dependencies
- `verify_images` can now resize images on top of checking they're not broken
- LR finder plot now uses python scientific notation instead of math superset notation
- `ImageDataBunch.from_df` doesn't change the dataframe
- Fix jupyter dep version
- Add tiny datasets
- remove wrong `Fbeta`
- fix implementation of `fbeta`
- `ImageDataBunch.single_from_classes` to allow single image predictions
- `DatasetBase` has `set_item` and `clear_item` to force it to always return `item`
- `DatasetBase` uses abstract `_get_x` and `_get_y`
- `batch_size` property in DeviceDataLoader
- `ClassificationLearner.predict` to get prediction on a single item
- Monkey-patched torch.Tensor so matplotlib works
- `Learner.create_unet`
- Data block API
- `validate` now takes optional `n_batch`
- `create_cnn` now returns a `ClassificationLearner`
- `return_path` flag to `Learner.save`
- `ImageDataBunch.show_batch()` now works for every type of dataset, removes `show_images` and `show_xy_images` as a result
- Monkey-patched torch.utils.data.dataloader.DataLoader to create a passthrough to the dataset
- `max_workers` for `download_images`
- Change the arguments of `ObjectDetectDataset` to make it consistent with the rest of the API, changes the return of `get_annotations` to go with it
- remove empty classes in `ImageDataBunch.from_folder`
- `ConvLearner` ctor is replaced by a function called `create_cnn`
- `Learner` objects now determine from the loss function if there is something to add on top of the models to get the true predictions
- Add `recurse` flag to `get_image_files`
- `show_xy_images` takes tensors instead of Image
- Add `classes` to SegmentationDataset
- `get_preds` now returns the true probabilities
- `TTA` averages the probabilities and not the last activations of the model
- `ClassificationInterpretation` has been changed accordingly and the `sigmoid` argument has been deprecated
- Make `pred_batch` faster and remove redundant `*`
- Bug in `Learner.pred_batch`
- Bug in `model_sizes` (thanks to dienhoa)
- Bug in `RNNLearner.classifier` when used on a multilabel dataset
- `download_images`: multi-process download of a file or URLs
- `verify_images`: multi-process verification of directory of images with optional deletion
- `ImageDataBunch.from_folder` now takes `valid_pct`
- master bar support in `download_url`
- various fixes to support the latest of `fastprogress`
- `Learner.normalize()` (without args) stores calculated stats in `Learner.stats`
- `pred_batch` moved to `basic_train` and fixed for multiple inputs
- `lr_find()` prints the next step to type when completed
- New version of fastprogress used; doesn't require ipywidgets
- Removed `cifar_norm`, `cifar_denorm`, `imagenet_norm`, `imagenet_denorm`
- pretrained language model is now downloaded directly in the .fastai/models/ folder. Use `pretrained_model=URLs.WT103`
- add an argument `stop_div` to `Learner.lr_find()` to prevent early stopping, useful for negative losses
- add an argument `convert_mode` to `open_mask` and `SegmentationDataset` to choose the PIL conversion mode of the masks
- `URLs.download_wt103()` has been removed
- change TextDataBunchClass method [`from_ids_files`, `from_tokens`, `from_df`, `from_csv`, `from_folder`] so that classes argument is passed to the call to TextDataset
- Strip space from file name when CSV has spaces
- Handle missing `loss_func` attr
- Pass on the `use_bn` parameter in `get_tabular_learner`
- Bad handling when final batch has size of 1
- rolled back numpy dependency to >=1.12 (anaconda package has an upper pin on it) and to pip>=9.0.1; the old versions are buggy but should be ok for fastai
- Added missing `pyyaml` dependency to conda too
- Use `spacy.blank` instead of `spacy.load` to avoid having to download the English model
- Added missing `pyyaml` dependency
- `EarlyStoppingCallback`, `SaveModelCallback`, `TerminateOnNaNCallback` (initial draft: fredguth)
- `datapath4file(filename)` returns suitable path to store or find data file called `filename`, using config file `~/.fastai/config.yml`, and default data directory `~/.fastai/data`, unless `./data` exists and contains that file
- MSELossFlat() loss function
- Simple integration tests for all applications
- `data` is now called `basic_data` to avoid weird conflicts when naming our data objects data
- `datasets.untar_data` and `datasets.download_data` will now download to fastai home directory `~/.fastai/data` if the dataset does not already exist locally in `./data`
- add `dep_var` column in `test_df` if it doesn't exist (Kevin Bird)
- backwards=True when creating a LanguageModelLoader (mboyanov)
- Not released
- New class `ImagePoints` for targets that are a set of point coordinates
- New function `Image.predict(learn:Learner)` to get the activations of the model in `Learner` for an image
- New function `Learner.validate` to validate on a given dl (default `valid_dl`), with maybe new metrics or callbacks
- New function `error_rate` which is just `1-accuracy()`
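
The relationship stated in the `error_rate` entry above can be shown with a tiny plain-Python sketch. It assumes predictions and targets are already class labels in equal-length sequences, whereas the library's metrics work on tensors; the names `accuracy_sketch` and `error_rate_sketch` are hypothetical.

```python
from typing import Sequence


def accuracy_sketch(preds: Sequence[int], targets: Sequence[int]) -> float:
    """Fraction of predicted labels that match the target labels."""
    correct = sum(p == t for p, t in zip(preds, targets))
    return correct / len(targets)


def error_rate_sketch(preds: Sequence[int], targets: Sequence[int]) -> float:
    """Complement of accuracy: the fraction of wrong predictions."""
    return 1.0 - accuracy_sketch(preds, targets)


# One of the four predictions is wrong.
err = error_rate_sketch([0, 1, 1, 0], [0, 1, 0, 0])
```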
- All vision models are now in the `models` module, including torchvision models (where tested and supported). So use `models` instead of `tvm` now. If your preferred torchvision model isn't imported, feel free to test it out and tell us on the forum if it works. And if it doesn't, a PR with a test and a fix would be appreciated!
- `ImageBBox` is now a subclass of `ImagePoints`
- All metrics are now `Callback`. You can pass a regular function like `accuracy` that will get averaged over batch or a full `Callback` that can do more complex things
- All datasets convenience functions and paths are inside the `URLs` class
- `URLs` that are a sample have name now suffixed with `_SAMPLE`
- Fix `WeightDropout` in RNNs when `p=0`
- `pad_collate` gets its `kwargs` from `TextClasDataBunch`
- Add small `eps` to `std` in `TabularDataset` to avoid division by zero
- `fit_one_cycle` doesn't take other callbacks
- Many broken docs links fixed
- Last release without CHANGES updates