Tube Archivist

A self-hosted YouTube archive with search and media management.

Tube Archivist is a self-hosted application for downloading, indexing, and managing YouTube videos. It uses Elasticsearch for full-text search across your archived content and provides a Netflix-style web UI for browsing channels and videos. Self-hosting gives you a permanent, searchable personal archive of YouTube content independent of YouTube's availability.

Alternatives considered

Cloud Hosted

| Tool | Open Source | Free Tier | Monthly Cost |
| --- | --- | --- | --- |
| YouTube Premium | No | No | From $13.99/mo |

Self Hosted

| Tool | Open Source | Full Features | Notes |
| --- | --- | --- | --- |
| yt-dlp | Yes | Yes | CLI only; no web UI or indexing |
| Invidious | Yes | Yes | Proxies YouTube; does not archive locally |

Installation

Architecture

  • Workloads: tubearchivist (main app, Deployment), archivist-es (Elasticsearch, StatefulSet), valkey (Valkey, a Redis-compatible store, StatefulSet)
  • Images: bbilly1/tubearchivist, bbilly1/tubearchivist-es, valkey/valkey:9.1.0-alpine (all digest-pinned)
  • Storage: Longhorn PVC (cache, k8up.io/backup: "true") for Elasticsearch data; NFS PVs from TrueNAS for downloads and video archive
  • Networking: ClusterIP services for app and ES; HTTPRoute via internal gateway
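
The networking bullet could translate into manifests along the following lines. This is a minimal sketch, not the repository's actual configuration: the Service name tubearchivist, the HTTPRoute name, the Gateway name internal, and the Gateway namespace gateway are assumptions made for illustration. The port 8000 and the hostname come from the rendered Deployment on this page (containerPort and TA_HOST).

```yaml
# Sketch only: ClusterIP Service in front of the tubearchivist pod.
apiVersion: v1
kind: Service
metadata:
  name: tubearchivist            # assumed name
  namespace: tube-archivist
spec:
  type: ClusterIP
  selector:
    app: tubearchivist           # matches the Deployment's pod label
  ports:
    - name: http
      port: 8000
      targetPort: http           # named container port from the Deployment
---
# Sketch only: route attached to an internal Gateway.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: tubearchivist            # assumed name
  namespace: tube-archivist
spec:
  parentRefs:
    - name: internal             # assumed Gateway name
      namespace: gateway         # assumed Gateway namespace
  hostnames:
    - tube-archivist.web.kueber.eu
  rules:
    - backendRefs:
        - name: tubearchivist    # the Service sketched above
          port: 8000
```

Elasticsearch and Valkey would need only ClusterIP Services (the app reaches them as archivist-es:9200 and valkey:6379), since neither is exposed through the gateway.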

Security

  • Tube Archivist runs as runAsUser: 1000, runAsGroup: 1000
  • Valkey: runAsUser: 999, runAsNonRoot: true, allowPrivilegeEscalation: false, capabilities dropped
  • Secrets SOPS-encrypted with age

Updates

Managed by Renovate. All images are digest-pinned.

Data Management

  • PVC: cache (Longhorn-encrypted, k8up.io/backup: "true") for Elasticsearch index data
  • NFS: TrueNAS NFS PVs for downloads staging and video archive
  • Backups: a k8up Schedule backs up the Elasticsearch Longhorn PVC to Hetzner S3.
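
For orientation, a k8up Schedule that backs up annotated PVCs in the namespace to an S3 bucket might look like the sketch below. Every name, the endpoint, the bucket, the cron expressions, and the retention values are placeholders chosen for illustration; the actual Schedule in the repository may differ.

```yaml
# Sketch only: all names, schedules, and the endpoint are hypothetical.
apiVersion: k8up.io/v1
kind: Schedule
metadata:
  name: tube-archivist-backup          # hypothetical name
  namespace: tube-archivist
spec:
  backend:
    repoPasswordSecretRef:             # restic repository password
      name: k8up-repo-credentials      # hypothetical Secret
      key: password
    s3:
      endpoint: https://s3.example.invalid   # placeholder for the Hetzner S3 endpoint
      bucket: example-backup-bucket          # placeholder bucket
      accessKeyIDSecretRef:
        name: k8up-s3-credentials      # hypothetical Secret
        key: access-key-id
      secretAccessKeySecretRef:
        name: k8up-s3-credentials
        key: secret-access-key
  backup:
    schedule: '0 3 * * *'              # example: daily at 03:00
    successfulJobsHistoryLimit: 2
    failedJobsHistoryLimit: 2
  prune:
    schedule: '0 4 * * 0'              # example: weekly on Sunday at 04:00
    retention:
      keepLast: 5
      keepDaily: 14
```

The backend credentials would live in SOPS-encrypted Secrets like the other secrets in this app.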

User Management

No OIDC is configured. User accounts are managed through the Tube Archivist web UI.

Configuration Management

  • ES_URL, REDIS_CON, HOST_UID, HOST_GID, TA_HOST, and TZ are set as plain environment values in the rendered Deployment
  • TA_USERNAME, TA_PASSWORD, and ELASTIC_PASSWORD are injected from the SOPS-encrypted secret tubearchivist-secrets
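
The rendered Deployment on this page reads three keys (TA_USERNAME, TA_PASSWORD, ELASTIC_PASSWORD) from a Secret named tubearchivist-secrets. A plaintext version of that Secret, before SOPS encryption, could be sketched as follows. The values are placeholders, and the assumption that the Secret holds only these three keys comes from the rendered manifests, not from the encrypted file itself.

```yaml
# Sketch only: plaintext form before encryption. Never commit this unencrypted.
apiVersion: v1
kind: Secret
metadata:
  name: tubearchivist-secrets
  namespace: tube-archivist
type: Opaque
stringData:
  TA_USERNAME: changeme          # placeholder
  TA_PASSWORD: changeme          # placeholder
  ELASTIC_PASSWORD: changeme     # placeholder; also read by the archivist-es StatefulSet
```

Assuming an age recipient is configured for the repository, the file would then be encrypted in place with sops before being committed.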

Administration

Usage

Subscribe to YouTube channels in the UI and trigger downloads. Tube Archivist queues downloads via Redis and stores videos on the NFS share. Search your archive with the Elasticsearch-backed full-text search. The browser extension can queue videos directly from YouTube.

Cluster-specific deviations from the above live in the per-cluster README — see k8s/apps/talos/tube-archivist/README.md.

Cluster Deployment

Tube Archivist — Talos cluster

Cluster-specific notes only. General product info, "why we use it", and alternatives live in docusaurus/docs/apps/tube-archivist.mdx.

Deviations from defaults

Defaults live in docusaurus/docs/apps/tube-archivist.mdx — document anything this cluster does differently here, with a one-line reason.

Kubernetes Metadata
  • Image: bbilly1/tubearchivist-es@sha256:9da63fb1973ec3d57daf6916be948eddd0d8a404cc8e447c938480c85fe2c554
  • Image: bbilly1/tubearchivist@sha256:dfe723cf008520e1758ecc3e59e6ea8761dd10d5bb099cd87289e80f5bd66567
  • Image: valkey/valkey:9.1.0-alpine@sha256:a35428eba9043cc0b79dbe54100f0c92784f2de00ad09b01182bfb1c5c83d1bd
Rendered manifests (kustomize build)
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kustomize.toolkit.fluxcd.io/force: enabled
  name: tubearchivist
  namespace: tube-archivist
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tubearchivist
  strategy:
    rollingUpdate: null
    type: Recreate
  template:
    metadata:
      labels:
        app: tubearchivist
    spec:
      containers:
        - env:
            - name: ES_URL
              value: http://archivist-es:9200
            - name: REDIS_CON
              value: redis://valkey:6379
            - name: HOST_UID
              value: '1000'
            - name: HOST_GID
              value: '1000'
            - name: TA_HOST
              value: https://tube-archivist.web.kueber.eu
            - name: TZ
              value: America/New_York
            - name: TA_USERNAME
              valueFrom:
                secretKeyRef:
                  key: TA_USERNAME
                  name: tubearchivist-secrets
            - name: TA_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: TA_PASSWORD
                  name: tubearchivist-secrets
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: ELASTIC_PASSWORD
                  name: tubearchivist-secrets
          image: bbilly1/tubearchivist@sha256:dfe723cf008520e1758ecc3e59e6ea8761dd10d5bb099cd87289e80f5bd66567
          livenessProbe:
            exec:
              command:
                - sh
                - '-c'
                - |
                  code=$(curl -k -s -o /dev/null -w "%{http_code}" http://localhost:8000)
                  [ "$code" -eq 200 ] || [ "$code" -eq 401 ]
            failureThreshold: 5
            initialDelaySeconds: 15
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 2
          name: tubearchivist
          ports:
            - containerPort: 8000
              name: http
          readinessProbe:
            exec:
              command:
                - sh
                - '-c'
                - |
                  code=$(curl -k -s -o /dev/null -w "%{http_code}" http://localhost:8000)
                  [ "$code" -eq 200 ] || [ "$code" -eq 401 ]
            failureThreshold: 5
            initialDelaySeconds: 15
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 2
          volumeMounts:
            - mountPath: /youtube
              name: videos
              subPath: youtube
            - mountPath: /cache/download
              name: downloads
              subPath: tube-archivist
            - mountPath: /cache
              name: cache
      volumes:
        - name: videos
          persistentVolumeClaim:
            claimName: tube-archivist-truenas-nfs-videos
        - name: downloads
          persistentVolumeClaim:
            claimName: tube-archivist-truenas-nfs-downloads
        - name: cache
          persistentVolumeClaim:
            claimName: cache
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: archivist-es
  namespace: tube-archivist
spec:
  replicas: 1
  selector:
    matchLabels:
      app: archivist-es
  serviceName: archivist-es
  template:
    metadata:
      labels:
        app: archivist-es
    spec:
      containers:
        - env:
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: ELASTIC_PASSWORD
                  name: tubearchivist-secrets
            - name: ES_JAVA_OPTS
              value: '-Xms1g -Xmx1g'
            - name: xpack.security.enabled
              value: 'true'
            - name: discovery.type
              value: single-node
            - name: path.repo
              value: /usr/share/elasticsearch/data/snapshot
          image: bbilly1/tubearchivist-es@sha256:9da63fb1973ec3d57daf6916be948eddd0d8a404cc8e447c938480c85fe2c554
          imagePullPolicy: IfNotPresent
          livenessProbe:
            exec:
              command:
                - sh
                - '-c'
                - |
                  code=$(curl -k -s -o /dev/null -w "%{http_code}" http://localhost:9200)
                  [ "$code" -eq 200 ] || [ "$code" -eq 401 ]
            initialDelaySeconds: 60
            periodSeconds: 20
          name: es
          ports:
            - containerPort: 9200
              name: http
          readinessProbe:
            exec:
              command:
                - sh
                - '-c'
                - |
                  code=$(curl -k -s -o /dev/null -w "%{http_code}" http://localhost:9200)
                  [ "$code" -eq 200 ] || [ "$code" -eq 401 ]
            initialDelaySeconds: 60
            periodSeconds: 20
          volumeMounts:
            - mountPath: /usr/share/elasticsearch/data
              name: es-data
      securityContext:
        fsGroup: 1000
        fsGroupChangePolicy: OnRootMismatch
        runAsGroup: 1000
        runAsUser: 1000
  volumeClaimTemplates:
    - metadata:
        annotations:
          k8up.io/backup: 'true'
        name: es-data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: longhorn-encrypted
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: valkey
  namespace: tube-archivist
spec:
  replicas: 1
  selector:
    matchLabels:
      app: valkey
  serviceName: valkey
  template:
    metadata:
      labels:
        app: valkey
    spec:
      containers:
        - args:
            - valkey-server
          image: valkey/valkey:9.1.0-alpine@sha256:a35428eba9043cc0b79dbe54100f0c92784f2de00ad09b01182bfb1c5c83d1bd
          livenessProbe:
            initialDelaySeconds: 10
            periodSeconds: 10
            tcpSocket:
              port: 6379
          name: valkey
          ports:
            - containerPort: 6379
              name: client
          readinessProbe:
            initialDelaySeconds: 3
            periodSeconds: 5
            tcpSocket:
              port: 6379
          resources:
            limits:
              memory: 512Mi
            requests:
              cpu: 50m
              memory: 128Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
          volumeMounts:
            - mountPath: /conf
              name: conf
            - mountPath: /data
              name: data
      securityContext:
        fsGroup: 1000
        fsGroupChangePolicy: OnRootMismatch
        runAsGroup: 1000
        runAsNonRoot: true
        runAsUser: 999
        seccompProfile:
          type: RuntimeDefault
      volumes:
        - emptyDir: {}
          name: conf
        - emptyDir: {}
          name: data