Papers on the convergence and convexity of neural networks

I have not read many of these papers yet, but there appears to be recent research on the convergence of neural networks, so I am writing down a list of papers ahead of time.

One of the difficulties encountered with neural networks is convergence.

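In optimization as well, nonlinear programming problems are hard to bring to convergence, so there have been many attempts to solve them by convexifying them. For artificial intelligence to be usable in real-world settings, it likewise seems that much more research is needed on whether convergence can be adequately achieved.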

Research on the convexity of neural networks

Milne, T. (2019). Piecewise strong convexity of neural networks. Advances in Neural Information Processing Systems, 32.

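In this paper, the loss function is defined as the sum of the original loss function and the Euclidean norm of the weights. The paper then shows that, with a network built only from ReLU activations and weights, the problem is convex within a certain set. However, because the network has no bias terms, it is hard to tell whether it satisfies the universal approximation theorem.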
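To make the setup described above concrete, here is a minimal sketch of a bias-free two-layer ReLU network with a loss of the form "original loss plus Euclidean norm of the weights". It is not the paper's code: the mean-squared-error base loss, the coefficient `lam`, the function names, and the toy data are all assumptions made for illustration, and whether the paper squares or rescales the norm should be checked against the paper itself.

```python
import math

def relu(z):
    return max(0.0, z)

def forward(x, W1, w2):
    """Bias-free two-layer ReLU network: w2 . relu(W1 x)."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(v * h for v, h in zip(w2, hidden))

def weight_norm(W1, w2):
    """Euclidean norm of all weights stacked into one vector."""
    flat = [w for row in W1 for w in row] + list(w2)
    return math.sqrt(sum(w * w for w in flat))

def regularized_loss(data, W1, w2, lam=0.1):
    """Assumed form: mean squared error plus lam times the weight norm."""
    mse = sum((forward(x, W1, w2) - y) ** 2 for x, y in data) / len(data)
    return mse + lam * weight_norm(W1, w2)

# Toy data and weights, made up for illustration only.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
W1 = [[1.0, 0.0], [0.0, -1.0]]
w2 = [1.0, 1.0]
loss = regularized_loss(data, W1, w2, lam=0.1)
```

One consequence of dropping the bias shows up directly in this sketch: `forward([0.0, 0.0], W1, w2)` is 0 for any choice of weights, and scaling the input by a positive factor scales the output by the same factor. That restriction is the kind of limitation behind the approximation concern noted above.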

I have come across many papers on the convexification of neural networks by the two authors T. Ergen and M. Pilanci, so I am recording them here.

Ergen, T., and Pilanci, M., “Convex Optimization for Shallow Neural Networks,” presented at the 2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2019. https://doi.org/10.1109/ALLERTON.2019.8919769

T. Ergen and M. Pilanci, “Convex duality and cutting plane methods for over-parameterized neural networks,” in OPT-ML workshop, 2019.

T. Ergen and M. Pilanci, “Revealing the structure of deep neural networks via convex duality,” in Proceedings of the 38th international conference on machine learning, in Proceedings of machine learning research, vol. 139. PMLR, Jul. 2021, pp. 3004–3014.

M. Pilanci and T. Ergen, “Neural networks are convex regularizers: Exact polynomial-time convex optimization formulations for two-layer networks,” in Proceedings of the 37th international conference on machine learning, in Proceedings of machine learning research, vol. 119. PMLR, Jul. 2020, pp. 7695–7705. [Online]. Available: https://proceedings.mlr.press/v119/pilanci20a.html

T. Ergen and M. Pilanci, “Global optimality beyond two layers: Training deep relu networks via convex programs,” in International conference on machine learning, PMLR, 2021, pp. 2993–3003.

The top two above are the most foundational papers, so those are the ones I am currently reading.

I am also noting the GitHub link of the authors above.

Research on convergence

S. Oymak and M. Soltanolkotabi, “Toward moderate overparameterization: Global convergence guarantees for training shallow neural networks,” IEEE Journal on Selected Areas in Information Theory, vol. 1, no. 1, pp. 84–105, 2020.

