Releases · ucbepic/docetl
0.2.0
What's Changed
- Sample by @redhog in #92
- Outliers by @redhog in #91
- Sample (+ Outlier Functionality) Operation by @shreyashankar in #100
- #91 document > item renaming by @garuna-m6 in #103
- chore: update dependency versions by @shreyashankar in #105
- docs: add 'output' argument to ResolveOp code example by @goutham794 in #106
- fix: edit agent for synth resolve task by @shreyashankar in #109
- fix: update python api with cluster & sample ops by @shreyashankar in #113
- docs: add sample and cluster to docs by @shreyashankar in #114
- New api by @redhog in #115
- fix: make docs work by @shreyashankar in #119
- feat: adding human in the loop for split-map-gather decomp by @shreyashankar in #120
- fix: cache partial pipeline runs by @shreyashankar in #122
- mark as flaky test by @shreyashankar in #123
- only compare distinct pairs in resolve by @shreyashankar in #124
- LinkResolveOperation by @redhog in #117
- Fix Resolve and Map progress bars by @michielree in #126
- Better auto batching for resolve LLM calls by @sushruth2003 in #128
- Merge auto batching PR by @shreyashankar in #129
- v1 of the UI! by @shreyashankar in #118
- Upgrade litellm version to v1.51.0-stable by @Tendo33 in #131
- feat: adding batching for map and filter calls by @shreyashankar in #133 (a sketch of the idea follows this list)
- docs: link filter to map by @shreyashankar in #135
- fix: optimizer bug where the reduce operation can't be optimized with azure by @shreyashankar in #136
- fix: only call os.makedirs on non-empty paths by @shreyashankar in #137
- UI: add basic chat-based assistant by @shreyashankar in #139
- fix: clear and run button should also bypass cache by @shreyashankar in #140
- feat: UDFs support added by @staru09 in #138
- Merge staging into MAIN by @shreyashankar in #141
- chore: load envs from current directory by @plpycoin in #142
- feat: add optimizer in the UI by @shreyashankar in #143
- chore: Use environment variable configuration files to set host, port by @plpycoin in #146
- fix: render cells in markdown and fix resizable panels by @shreyashankar in #148
- hotfix: an error occurred when running make run-ui by @plpycoin in #152
- fix: switch crypto uuid to regular uuid by @shreyashankar in #155
- feat: provide defaults in the UI chat by @shreyashankar in #156
- fix: allow reduce_key types to be lists by @shreyashankar in #162
- fix: allow user to pass in litellm completion kwargs by @shreyashankar in #163
- Sagemaker doesn't yet support tools by @njbrake in #165
- fix: save validation and gleaning by @shreyashankar in #167
- Remove unnecessary console log statements by @shreyashankar in #168
- chore: update docs to link to paper by @shreyashankar in #174
- feature: add automatic optimization check to the UI (opt in) by @shreyashankar in #175
- fix: ts errors by @shreyashankar in #176
- fix: ts errors by @shreyashankar in #177
- feat: tie histograms to output types by @shreyashankar in #178
- Change for issue: #180 by @yogitha2023 in #181
- make test less flaky by @shreyashankar in #182
- feat: add code operations to the ui (#169) by @shreyashankar in #183
- hotfix: Import declaration conflicts with local declaration of 'Operation'. by @plpycoin in #185
- chore: make output visualizations better by @shreyashankar in #186
- chore: edit copy for the UI by @shreyashankar in #189
- feat: add pdf upload for the UI by @shreyashankar in #190
- fix sampling in second op onwards by @shreyashankar in #192
- Accept a custom parameter specifying the number of concurrent threads for the code_* operations by @plpycoin in #195
- chore: fix optimizer and API for user study by @shreyashankar in #196
- add python multipart to requirements by @samelamin in #197
- fix: add docs to describe how to set up .env.local by @shreyashankar in #199
- Intermediates fix by @samelamin in #200
- fix: change default settings by @shreyashankar in #202
- feat: Add azure doc intelligence to PDF upload in the UI by @shreyashankar in #203
- feat: change histograms to be bar charts for categorical columns by @shreyashankar in #204
- ensure the value is converted to a string when it's not an object by @plpycoin in #206
- feat: add docker file by @shreyashankar in #205
- feat: add basic llm call observability to the UI by @shreyashankar in #209
- feat: have global system prompt and description by @shreyashankar in #210
- fix: small errors in user study by @shreyashankar in #211
- fix: support markdown files by @shreyashankar in #213
- Add variable descriptions that limit the number of concurrent threads by @plpycoin in #215
- feat: add column view dialog by @shreyashankar in #214
- Update reduce folding instruction to be clearer by @shreyashankar in #217
- fix: make histogram calculation and rendering less blocking by @shreyashankar in #218
- edit system prompt in prompt improvement by @shreyashankar in #223
- fix: edit system prompt for prompt rewriter by @shreyashankar in #224
- Fix cache naming by @sushruth2003 in #220
- refactor recursive optimization for map operations by @shreyashankar in #225
- feat: adding namespaces by @shreyashankar in #226
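Several of the items above (#128, #133) revolve around batching LLM calls for map and filter operations instead of issuing one request per document. The snippet below is a minimal, self-contained sketch of that idea only; `run_map_in_batches`, `call_llm_batch`, and `batch_size` are hypothetical names, not DocETL's actual API.

```python
# Conceptual sketch only: batching map-style LLM calls (cf. #128, #133).
# All names here are hypothetical, not DocETL's real interface.
from typing import Callable, List


def run_map_in_batches(
    items: List[dict],
    call_llm_batch: Callable[[List[dict]], List[dict]],
    batch_size: int = 10,
) -> List[dict]:
    """Process items in fixed-size batches instead of one call per item."""
    results: List[dict] = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        results.extend(call_llm_batch(batch))  # one LLM request per batch
    return results
```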
New Contributors
- @garuna-m6 made their first contribution in #103
- @goutham794 made their first contribution in #106
- @michielree made their first contribution in #126
- @sushruth2003 made their first contribution in #128
- @Tendo33 made their first contribution in #131
- @plpycoin made their first contribution in #142
- @njbrake made their first contribution in #165
- @yogitha2023 made their first contribution in #181
- @samelamin made their first contribution in #197
Full Changelog: 0.1.7...0.2.0
0.1.7
What's Changed
- Add Operation Hash and Caching Functionality by @shreyashankar in #61
- docs: improving documentation for pipeline api by @shreyashankar in #62
- refactor: adding website code by @shreyashankar in #65
- (partial) fix: add exponential backoff for rate limit errors by @shreyashankar in #66 (a sketch of the pattern follows this list)
- fix: enable gleaning llm calls to work by @shreyashankar in #70
- Added llama-index based parsers by @redhog in #71
- Merging staging to main by @shreyashankar in #74
- feat: add pdfgpt to parse PDFs by @staru09 in #67
- Merging staging to main (from add gpt_pdf) by @shreyashankar in #76
- fix: disable additional properties for gemini by @shreyashankar in #73
- Throttle by @redhog in #64
- feat: support rate limits by @shreyashankar in #79
- feat: add verbose parameter for gleaning by @shreyashankar in #80
- Parsers can now return any number of fields, and can access the whole item by @redhog in #81
- Merge staging to main (after parsers refactor) by @shreyashankar in #82
- docs: add sample parameter by @shreyashankar in #87
- Clustering by @redhog in #84
- Merge staging to main (after adding cluster operator) by @shreyashankar in #88
- feat: output to csv if user specifies a csv file by @shreyashankar in #89
- Rename internal methods by @redhog in #90
- Nits for cleaning up API. by @shreyashankar in #93
- refactor: move validation and gleaning into call llm by @shreyashankar in #98
- Staging to main by @shreyashankar in #99
- feat: add reduce operation lineage by @shreyashankar in #101
- fix: change gleaning prompt to validation_prompt by @shreyashankar in #102
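PR #66 above adds exponential backoff for rate-limit errors. Below is a minimal sketch of that general pattern, assuming a generic `RateLimitError` and retry budget rather than DocETL's actual exception handling.

```python
# Conceptual sketch only: exponential backoff on rate-limit errors (cf. #66).
# The exception type and retry limits are assumptions, not DocETL's code.
import random
import time


class RateLimitError(Exception):
    """Stand-in for the provider-specific rate-limit exception."""


def call_with_backoff(make_request, max_retries: int = 5):
    for attempt in range(max_retries):
        try:
            return make_request()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Sleep 1s, 2s, 4s, ... plus jitter before retrying.
            time.sleep(2 ** attempt + random.random())
```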
Full Changelog: 0.1.6...0.1.7
0.1.6
What's Changed
- docs: fix resolve docs by @shreyashankar in #27
- docs: link to ollama chat by @shreyashankar in #28
- feat: show better progress bars for operations by @shreyashankar in #30
- Add batching support to map operations with configurable parameters by @orban in #16
- feat: implement batch limit in map operations by @shreyashankar in #31
- Add Dataset Class and Parsing Tools by @shreyashankar in #32
- docs: improve clarity for custom parsing by @shreyashankar in #34
- Add Azure Document Intelligence Read Tool by @shreyashankar in #36
- fix: read .env from the user's cwd and change tool schema so ollama llama models work better by @shreyashankar in #38
- Bugfix for sqlite3 operation error in cache by @redhog in #40
- fix: make diskcache reads thread-safe by @shreyashankar in #42 (a sketch of the pattern follows this list)
- fix: template in tutorial.md by @shreyashankar in #43
- RateLimit error by @redhog in #39
- feat: add paddleocr by @shreyashankar in #44
- Entrypoints by @redhog in #45
- fix: don't cache results of bad llm calls by @shreyashankar in #52
- fix: default to gpt 4o tokenizer by @shreyashankar in #57
- feat: print out LLM message history and tools when there's an InvalidOutputError by @shreyashankar in #53
- feat: don't use tool calling for ollama/OSS models if the output schema is just one param by @shreyashankar in #59
- docs: add documentation for using split gather pipeline by @shreyashankar in #60
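PR #42 above makes diskcache reads thread-safe. The sketch below illustrates the general pattern of serializing cache access with a lock; the `ThreadSafeCache` class is hypothetical and the real fix may differ.

```python
# Conceptual sketch only: guarding cache access with a lock so concurrent
# threads don't trip over a shared cache handle (cf. #42). This class is
# hypothetical, not DocETL's code.
import threading


class ThreadSafeCache:
    def __init__(self, backing: dict):
        self._backing = backing          # stands in for an on-disk cache
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:                 # serialize access across threads
            return self._backing.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._backing[key] = value
```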
Full Changelog: 0.1.5...0.1.6
0.1.5
What's Changed
- fix: add error messages if model doesn't support tool calling by @shreyashankar in #26
Full Changelog: 0.1.4...0.1.5
v0.1.4
What's Changed
- fix: manually try to parse ollama outputs, even if it is not valid json by @shreyashankar in #25
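The fix in #25 falls back to salvaging JSON from model output that is not strictly valid. A minimal sketch of one such recovery strategy follows; the actual logic in DocETL may differ.

```python
# Conceptual sketch only: recovering a JSON object from loosely formatted
# model output (cf. #25). The fallback strategy here is an assumption.
import json


def parse_llm_json(text: str) -> dict:
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the first {...} span in the text, if any.
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            return json.loads(text[start:end + 1])
        raise
```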
Full Changelog: 0.1.3...0.1.4
v0.1.3
What's Changed
- quality of life: show error when trying to execute resolve without blocking by @shreyashankar in #12
- Optionally persist intermediates for reduce by @shreyashankar in #14
- Add a save config method to the Python API by @shreyashankar in #15
- Remove unnecessary name parameter from parallel map operation. by @shreyashankar in #17
- Add podcast to readme by @shreyashankar in #18
- fix: remove openai client call in utils.py by @shreyashankar in #22
- Add Configurable Timeouts for Operations and Ollama Integration Documentation by @shreyashankar in #24
Full Changelog: 0.1.2...0.1.3
v0.1.2
This release fixes a bug where the typer dependency was missing. It also adds a Python API.
0.1.1
Full Changelog: 0.1.0...0.1.1